BIL3XXX | Introduction to Data Science | 3+0+0 | 4 | ||||
Year / Semester | Spring semester | ||||||
Level of Course | First Cycle | ||||||
Status | Elective | ||||||
Department | Computer Engineering | ||||||
Prerequisites and co-requisites | N/A | ||||||
Mode of Delivery | Face to face | ||||||
Contact hours | 14 weeks | ||||||
Lecturer | Asst. Prof. Dr. Murat AYKUT | ||||||
Co-Lecturer | |||||||
Language of instruction | Turkish | ||||||
Internship | N/A | ||||||
Objectives of the Course | |||||||
The course aims to teach students the fundamentals of data science, data preprocessing operations, data reduction methods, learning approaches, and data visualization techniques, with practical code examples. | |||||||
Learning Outcomes | CTPO | TOA | |||||
Upon successful completion of the course, the students will be able to | |||||||
LO - 1 : | Learn the basic concepts of data science. | 2, 4 | 1 | ||||
LO - 2 : | Gain knowledge of data preprocessing and data reduction methods. | 2, 4 | 1,3 | ||||
LO - 3 : | Gain knowledge of approaches to learning from data. | 2, 4 | 1,3 | ||||
LO - 4 : | Gain knowledge of data visualization. | 2, 4 | 1,3 | ||||
CTPO : Contribution to programme outcomes, TOA : Type of assessment (1: Written exam, 2: Oral exam, 3: Homework assignment, 4: Laboratory exercise/exam, 5: Seminar / presentation, 6: Term paper), LO : Learning Outcome | |||||||
Contents of the Course | |||||||
Introduction; Data Types; Data Preparation; Dealing with Missing Values; Dealing with Noisy Data; Data Reduction; Data Augmentation; Feature Selection; Instance Selection; Outlier Removal; Discretization; Supervised Learning; Regression Modeling; Unsupervised Learning; Model Evaluation; Association Rules; Data Summarization and Visualization. | |||||||
Course Syllabus | |||||||
Week | Subject | Related Notes / Files | |||||
Week 1 | Introduction, Data Types | ||||||
Week 2 | Data Preprocessing, Missing Value, Noisy Data | ||||||
Week 3 | Data Reduction: Feature Selection, Feature Extraction | ||||||
Week 4 | Data Reduction: Case Reduction, Feature Discretization | ||||||
Week 5 | Data Augmentation | ||||||
Week 6 | Outlier Removal | ||||||
Week 7 | Supervised Learning: Logistic Regression, kNN, Decision Trees | ||||||
Week 8 | Supervised Learning: Naive Bayes, SVM, Ensemble Learning | ||||||
Week 9 | Midterm exam | ||||||
Week 10 | Regression Modeling | ||||||
Week 11 | Unsupervised Learning: k-Means, Expectation-Maximization, Hierarchical Clustering | ||||||
Week 12 | Model Evaluation | ||||||
Week 13 | Association Rules: Apriori, FP-Growth, Collaborative Filtering | ||||||
Week 14 | Fundamentals of Text Mining | ||||||
Week 15 | Data Summarization and Visualization | ||||||
Week 16 | Final exam | ||||||
Textbook / Material | |||||||
1 | Chantal D Larose, Daniel T. Larose, "Data Science Using Python and R", Wiley, 2019, 256 pages. | ||||||
Recommended Reading | |||||||
2 | Salvador Garcia, Julian Luengo, Francisco Herrera, "Data Preprocessing in Data Mining", Springer, 2015, 320 pages. | ||||||
3 | Laura Igual, Santi Seguí, "Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications", Springer, 2017, 218 pages. | ||||||
Method of Assessment | |||||||
Type of work | Week No | Date | Duration (hours) | Weight (%) | |||
Mid-term exam | 9 | 2 | 50 | ||||
Quiz | |||||||
Homework | |||||||
Term Project | |||||||
Final exam | 16 | 2 | 50 | ||||
Student Work Load and its Distribution | |||||||
Type of work | Duration (hours pw) | No of weeks / Number of activity | Hours in total per term | ||||
Lectures (Interactive) | 3 | 14 | 42 | ||||
Own (personal) studies outside class | 3 | 14 | 42 | ||||
Own study for mid-term exam | 6 | 1 | 6 | ||||
Mid-term exam | 2 | 1 | 2 | ||||
Quiz | 0 | ||||||
Own study for final exam | 6 | 1 | 6 | ||||
End-of-term exam | 2 | 1 | 2 | ||||
Other 1 | 0 | ||||||
Total work load | 100 |
All announcements will be made and all resources shared via the course Piazza page (join the class "BIL 3020" as a student).