{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "500bd02c-eee8-45e6-a301-3482638767de",
   "metadata": {},
   "source": [
    "# Data Engineering and Analytics\n",
    "Master Software Engineering\n",
    "\n",
    "Eva Zangerle\n",
    "\n",
    "## General Notes\n",
    "* Some of the code is taken from other sources, such as books.\n",
    "* Sources are annotated (and acknowledged!) as follows:\n",
    "    * (CleaningData): Cleaning Data for Effective Data Science: Doing the Other 80% of the Work with Python, R, and Command-Line Tools; David Mertz; Packt Publishing, 2021; [GitHub repo](https://github.com/PacktPublishing/Cleaning-Data-for-Effective-Data-Science/)\n",
    "    * (FeatureEng): Feature Engineering for Machine Learning; Alice Zheng and Amanda Casari; O'Reilly, 2018; [GitHub repo](https://github.com/alicezheng/feature-engineering-book)\n",
    "    * (DSHandbook): Python Data Science Handbook; Jake VanderPlas; O'Reilly, 2016; [GitHub repo](https://github.com/jakevdp/PythonDataScienceHandbook)\n",
    "* Unless marked otherwise, code was written by Eva Zangerle.\n",
    "* I deliberately mix different Python packages (e.g., matplotlib, pandas, and seaborn for visualization) to showcase their use.\n",
    "\n",
    "\n",
    "\n",
    "## Virtual environments\n",
    "\n",
    "![xkcd python environment](https://imgs.xkcd.com/comics/python_environment.png)\n",
    "\n",
    "Comic taken from xkcd: https://xkcd.com/1987/ (CC BY-NC 2.5)\n",
    "\n",
    "\n",
    "\n",
    "A good tutorial on pipenv and Jupyter(Lab): https://towardsdatascience.com/virtual-environments-for-data-science-running-python-and-jupyter-with-pipenv-c6cb6c44a405#\n",
    "\n",
    "\n",
    "## Useful Python stuff\n",
    "* Startup files: https://ipython.readthedocs.io/en/stable/interactive/tutorial.html#startup-files\n",
    "* tqdm progress bars (also for Jupyter): https://github.com/tqdm/tqdm\n",
    "* nbval, a pytest plugin for validating Jupyter notebooks: https://github.com/computationalmodelling/nbval\n",
    "* nbQA for running code-quality tools (e.g., linters and formatters) on Jupyter notebooks: https://github.com/nbQA-dev/nbQA\n",
    "\n",
    "## Further tools\n",
    "* jq, a command-line JSON processor: https://stedolan.github.io/jq/"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}