The internet offers abundant possibilities to collect data (e.g., from social networks, from digital media providers, from price comparison websites, from online platforms) that can be used in empirical research projects and/or provide business value. After successful completion of this course, students will be able to:
- Identify online data sources and evaluate their value in the context of a specific research question or business problem
- Assess the terms and conditions for collecting, storing, and sharing data
- Collect data via web scraping and Application Programming Interfaces (APIs) by mixing, extending and repurposing code snippets
- Transform semi-structured JSON data to structured data sets for statistical analysis (“parsing”)
- Store and manage data using file-based systems and databases
- Draft, execute, monitor and audit online data collections locally and remotely
- Document and archive collected data, and make it available for public (re)use
- Track and share progress on the course’s learning goals
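To make the "parsing" goal above more concrete, here is a minimal sketch in Python. The JSON payload, its field names (`results`, `artist`, `tags`), and the helper functions `parse_items` and `rows_to_csv` are hypothetical and made up for illustration; real APIs will return differently shaped data.

```python
import csv
import io
import json

# Hypothetical API response, invented for illustration (not from a real service).
raw = """
{"results": [
  {"id": 1, "title": "Song A", "artist": {"name": "X"}, "tags": ["pop", "dance"]},
  {"id": 2, "title": "Song B", "artist": {"name": "Y"}, "tags": []}
]}
"""

FIELDS = ["id", "title", "artist_name", "n_tags"]


def parse_items(json_text):
    """Flatten each nested record into one flat dictionary (one row per item)."""
    data = json.loads(json_text)
    rows = []
    for item in data.get("results", []):
        rows.append({
            "id": item.get("id"),
            "title": item.get("title"),
            # Nested object: pull out a single attribute.
            "artist_name": item.get("artist", {}).get("name"),
            # Nested list: summarise it as a count.
            "n_tags": len(item.get("tags", [])),
        })
    return rows


def rows_to_csv(rows):
    """Serialise the flat rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_items(raw)
print(rows_to_csv(rows))
```

Each row ends up with the same columns, so the result can be loaded directly into statistical software. How to flatten nested lists (a count, as here, or one row per element) depends on the research question.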
Students pass this course if the final course grade (i.e., the weighted average of the individual components; weights indicated below) is ≥ 5.5, and the exam is passed (≥ 5.5).
- Team project (4-5 team members) (40% + 10% individual assessment on the basis of self- and peer assessment [1])
- Share individual progress and learnings (e.g., open science contributions like tutorials or code snippets in the form of pull requests to GitHub, maintaining a public FAQ/blog, sharing one’s progress with the group) (10%)
- Computer exam (40%)
[1] Self- and peer assessment: The team project is subject to self- and peer assessment, i.e., students’ grades will be corrected upwards or downwards, depending on their own contribution to the overall team effort. Students provide written feedback to each other once during the course, and score themselves and their team members on, among other things, the quantity and quality of their contributions.
- Hybrid format: Jupyter notebooks or pre-recorded web clips for preparation and self-paced lab sessions; live streams on Zoom for feedback and joint coding sessions (recordings will be made available)
- Modern content: copy-paste code snippets and demos from the course page, access code on GitHub, start projects with templates from GitHub, share screens and solve problems
- Interactive, immersive and student-centred: live coding, debates, open-source content contributions, scraping real websites and accessing real APIs
Student profile / prerequisites
- The course is taught to MSc students in the Marketing Analytics (TiSEM) program.
- The course expects students to have a working knowledge of Python (e.g., from introductory courses at DataCamp), including an understanding of data types (e.g., strings, integers), loops, if-else statements, and functions.
- The course welcomes novices, who are expected to do extra preparation before the course starts. Preparation material will be shared with students in advance in the form of Jupyter Notebooks or course recommendations at DataCamp. Novices may further benefit from following other courses at Tilburg University in which Python is used, for example, Research Skills: Data Processing and Research Skills: Data Processing Advanced.
- Students are advised to use their own computer for this course (Windows, Mac or Linux). Android/Chromebook/iOS devices are not supported.
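As a rough self-check for the Python prerequisites listed above, the following sketch combines data types, a loop, an if-else statement, and a function. The function `price_label`, its thresholds, and the example values are hypothetical and not taken from the course material; students who can read and modify this snippet comfortably likely meet the expected level.

```python
def price_label(price):
    """Return a coarse label for a numeric price (thresholds are arbitrary)."""
    if price < 10:
        return "low"
    elif price < 25:
        return "medium"
    else:
        return "high"


prices = [4.99, 12.5, 30]   # a list holding two floats and one integer
shop = "example-shop"       # a string

labels = []
for price in prices:        # loop over the list, one price at a time
    labels.append(price_label(price))

print(shop, labels)
```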
Enrollment and Obtaining Course Credits
- The course (3 ECTS) will be taught in the Marketing Analytics Program at Tilburg University (please check Osiris for the specifics).
- Interested Research Master or PhD students who seek to advance their data collection skills can audit this course upon the approval of the instructor and their coordinator.