DS 601 Fundamentals of Data Science
This course introduces the principles of the data science field and reviews common functionality of toolkits suited to a data scientist's portfolio, with emphasis on cleaning, processing, merging, and manipulating datasets, as well as statistical concepts and tests for significance in data. In particular, the course focuses on how to read datasets into data structures, query and index these structures, merge multiple data structures, summarize data into tables, group data into logical categories, and manipulate dates. Coverage also includes essential machine learning topics such as regression, classification, and clustering, as well as the importance of storytelling and data visualization in the data science process. This is an intensive practice course that uses real-world datasets and a variety of statistical methods, both to clean data and to compute statistical metrics for data analysis.
DS 602 Statistics and Probability in Data Science
The course introduces the foundations of probability and statistics necessary for modeling and making statistical decisions. The course combines mathematical theory with the practice of applying this theory to actual data. Coverage includes probability and probability distributions, descriptive statistics, random variables, dependence, statistical tests and confidence intervals, correlation, regression, entropy, and experimental design.
DS 603 Big Data Management Using Hadoop
This course introduces the fundamentals of managing big data and the standard big data platform suited to handling large and varied volumes of data. The course includes an introduction to Hadoop; Hadoop components for data organization, storage, retrieval, and analysis; processing data using Hadoop; and querying big data. Topics include large-scale data analysis, data storage systems, data representations, and semi-structured data models. The course involves hands-on practice using Hadoop to handle real-world datasets.
DS 604 Advanced Database Queries and Data Warehouse in Data Science
This course reviews concepts and practices as applied to databases and data warehouses. Coverage includes developing advanced database queries, distributed databases, and performance issues in database systems. The course also introduces the state of the art on a set of topics in data warehouses and dimensional data modeling.
DS 620 Data Visualization & Data Representation Techniques
This course focuses on data visualization techniques for effective communication of data science results through visual analytics dashboards and storytelling. Students will gain a thorough understanding of information visualization design, principles, guidelines, and evaluation criteria. The course covers best practices for creating basic diagrams and charts, statistical charts and tables, and selecting appropriate methods for specific problems. Students will learn how to make data-driven decisions, apply interactive techniques to visualizations, and perform data transformations. The course also includes hands-on experience in creating interactive visualizations and dashboards using Python and leading tools such as Power BI, Tableau, and QlikView.
DS 621 Research Methods in Data Science
The purpose of the course is to give students an opportunity to prepare their thesis or research project proposal, develop an excellent writing style and compelling research strategies, and analyze ethical issues that may arise when working with datasets. Students formulate a problem statement, conduct the literature review, identify the type of research method strategy, plan the overall structure of the study, and prepare the project datasets. Students also learn about issues of reproducibility and how to set up a data science product so that it is reproducible.
DS 623 Machine Learning in Data Science
This course introduces machine learning techniques and methods. Coverage includes applied machine learning for the data scientist, the issue of data dimensionality, the task of data clustering, cluster evaluation, supervised approaches for building predictive models, and data generalization (e.g., cross-validation and overfitting). The course also covers advanced techniques, such as developing ensemble methods, and the practical limitations of predictive models.
DS 624 Text Mining in Data Science
This course introduces text mining and manipulation techniques. Coverage includes working with text toolkits, the structure of text for both humans and machines, text manipulation needs, regular expressions, text cleaning, preparing text for processing by machine algorithms, natural language processing techniques, classification, and topic and similarity detection in documents. The course involves hands-on practice using real-world documents.
DS 625 Social Network Analysis in Data Science
The course introduces the process of modeling social structures as networks, analyzing the connectivity of social networks, measuring the importance of nodes within a social network using measures such as centrality and closeness, network evolution over time, models of network generation, and the link prediction problem. The course involves hands-on practice using real-world documents.
DS 626 Management Data Science
The course applies the tools and techniques of management science to engineering problems. It covers linear programming, sensitivity analysis, waiting lines, decision analysis, and forecasting. The course makes use of available optimization software and spreadsheets to solve practical problems through case studies.
DS 628 Ethics in Data Science
This course introduces the ethical, policy, and legal issues that computer professionals and data scientists face while working with information systems and datasets. The course discusses intellectual property, privacy debates, and the laws and professional ethics governing these issues when working with computer systems. It also examines related cybersecurity issues. Specific case studies and assignments are used to illustrate the issues discussed.
DS 631 Data Science Thesis
The purpose of the thesis is to provide an opportunity to analyze large-scale real-world problems according to the scientific method. The student demonstrates an ability to apply various data science approaches and techniques to design, implement, test, and evaluate a solution to a complex problem, deriving insights from data and sharing ideas with other stakeholders. The focus of the thesis depends on the selected elective courses and therefore corresponds to the data science product outcome.