Institut für Informatik
dbis
dbis-teaching
data-engineering-analytics-notebooks
Commit 18e0b49d
authored Oct 05, 2021 by Eva Zangerle
cleaned notebooks
parent 4768fa62
Showing 6 changed files with 1997 additions and 556 deletions
data/big-random.csv +8 -4
notebooks/01_introduction.ipynb +4 -22
notebooks/03_datasets.ipynb +1103 -196
notebooks/05_dataset_analysis_cleaning.ipynb +255 -247
notebooks/06_dataset_analysis_exploratory.ipynb +623 -84
notebooks/07_feature_engineering.ipynb +4 -3
data/big-random.csv
...
...
@@ -72,7 +72,8 @@ pedagogizzavano,12,vibratrice,92,striglio,58
epigrammino,56,compravenduti,84,arzigogolava,55
inibitoria,26,aerolinee,54,varero,88
squarcerai,45,quiescenze,12,scuoieremo,70
-fantasmagorici,28,immischiavate,44,schiavizzammo,97,sfilzarono,49
+fantasmagorici,28,immischiavate,44
+schiavizzammo,97,sfilzarono,49
interagiste,50,repentagli,72,attendato,95
crossiste,17,maiolicheranno,24,espugniamo,64
ribattezzava,36,contestataria,98,appezzino,60
...
...
@@ -710,7 +711,8 @@ barcheggiasti,36,subaccolleranno,44,libanizzavo,41
salesiane,42,tutelai,40,sublimare,56
intridevi,75,mazzuolavo,36,polemizzino,72
resettando,58,strisciato,46,insaldai,62
-aspirasse,15,imbozzimatrici,70,incanalante,93,succhieremo,41
+aspirasse,15,imbozzimatrici,70
+incanalante,93,succhieremo,41
saccarometriche,18,stremaste,12,hindi,19
immergano,99,estolta,99,resistenti,68
schiavo,13,carrellavamo,50,scimmiotti,23
...
...
@@ -8094,7 +8096,8 @@ querelavate,97,carella,49,artocebo,74
intercludeva,37,brezzoline,87,viceprovincia,16
sviarono,49,disalveante,37,raschierete,12
squincio,16,biascicona,93,solisti,70
-rinegoziante,50,circoncidiamo,83,stringavate,79,stipularono,34
+rinegoziante,50,circoncidiamo,83
+stringavate,79,stipularono,34
disaeriamo,94,sfiorisce,37,mesterebbe,46
inosservanti,45,esporrei,98,angustiosi,21
spalmeresti,18,interromperai,20,notifichiate,54
...
...
@@ -12292,7 +12295,8 @@ smozzante,98,scalfendo,29,indirizzavo,23
sprovvedevi,90,nepentacee,13,raffinasti,69
raddobbassi,10,tortoreggiate,24,telecomandavamo,27
stremeranno,89,trastullassi,79,rinsavireste,20
-palettizzassimo,50,colliquerei,26,scarseggiata,93,schioppetteria,63
+palettizzassimo,50,colliquerei,26
+scarseggiata,93,schioppetteria,63
imperammo,47,baluginano,61,premoriranno,59
rivisita,12,agguanteresti,47,isotropo,87
polifonie,10,attufferai,17,legiferavo,36
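Each of the four hunks above replaces one eight-field row with two four-field rows, while the surrounding context rows carry six fields. A minimal sketch for locating rows of irregular width follows. It assumes six fields per row is the regular shape (inferred from the context lines, not stated in the commit), and `find_irregular_rows` is a hypothetical helper name, not something from the notebooks.

```python
import csv
import io


def find_irregular_rows(lines, expected_fields=6):
    """Return (line_number, field_count) for rows whose width differs.

    `expected_fields=6` is an assumption taken from the context rows
    in the diff; adjust it if the file's regular shape is different.
    """
    reader = csv.reader(lines)
    irregular = []
    for row in reader:
        if len(row) != expected_fields:
            # reader.line_num is the 1-based number of the last line read
            irregular.append((reader.line_num, len(row)))
    return irregular


# Rows copied from the first hunk (new side of the diff)
sample = io.StringIO(
    "squarcerai,45,quiescenze,12,scuoieremo,70\n"
    "fantasmagorici,28,immischiavate,44\n"
    "schiavizzammo,97,sfilzarono,49\n"
    "interagiste,50,repentagli,72,attendato,95\n"
)
print(find_irregular_rows(sample))  # prints [(2, 4), (3, 4)]
```

To run the same check on the file itself, an open file handle for `data/big-random.csv` (opened with `newline=""`, as the `csv` module recommends) can be passed in place of the `StringIO` object.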
notebooks/01_introduction.ipynb
...
...
@@ -10,16 +10,6 @@
"\n",
"Eva Zangerle\n",
"\n",
-"## Overview of Notebooks\n",
-"* Datasets: [3_datasets.ipynb](3_datasets.ipynb)\n",
-"* Data Preparation and Quality: [4_data_preparation_quality.ipynb](4_data_preparation_quality.ipynb)\n",
-"* Feature Engineering: [5_feature_engineering.ipynb](5_feature_engineering.ipynb)\n",
-"* Dataset Analyses: [6_dataset_analyses.ipynb](6_dataset_analyses.ipynb)\n",
-"* Hypotheses and Evaluation: [7_hypotheses_evaluation.ipynb](7_hypotheses_evaluation.ipynb)\n",
-"* Modeling and Prediction: [8_modeling_prediction.ipynb](8_modeling_prediction.ipynb)\n",
-"* Reproducible Research: [9_reproducible_research.ipynb](9_reproducible_research.ipynb)\n",
-"\n",
-"\n",
"## General Notes\n",
"* Code is partly taken from further sources, such as books.\n",
"* Sources are annotated (and acknowledged!) as follows:\n",
...
...
@@ -43,19 +33,11 @@
"## Useful python stuff\n",
"* Startup files: https://ipython.readthedocs.io/en/stable/interactive/tutorial.html#startup-files\n",
"* tqdm progress bars (also for Jupyter): https://github.com/tqdm/tqdm\n",
+"* nbval for validating Jupyter notebooks: https://github.com/computationalmodelling/nbval\n",
+"* nbqa for quality assurance for Jupyter notebooks: https://github.com/nbQA-dev/nbQA\n",
"\n",
"## Further tools\n",
-"* jq command linen json processor: https://stedolan.github.io/jq/\n"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "f6860d72-11e4-4574-b20b-7884a6653abc",
-"metadata": {},
-"outputs": [],
-"source": [
-" "
+"* jq command line json processor: https://stedolan.github.io/jq/"
]
}
],
...
...
@@ -75,7 +57,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.9.6"
+"version": "3.9.7"
}
},
"nbformat": 4,
...
...
notebooks/03_datasets.ipynb
This source diff could not be displayed because it is too large.
notebooks/05_dataset_analysis_cleaning.ipynb
This diff is collapsed.
notebooks/06_dataset_analysis_exploratory.ipynb
This diff is collapsed.
notebooks/07_feature_engineering.ipynb
...
...
@@ -19,10 +19,11 @@
"source": [
"# import required packages\n",
"from pprint import pprint\n",
-"from scipy.stats import gmean, hmean\n",
-"from sklearn.datasets import load_digits\n",
"\n",
"from matplotlib import cm\n",
-"from matplotlib.colors import ListedColormap"
+"from matplotlib.colors import ListedColormap\n",
+"from scipy.stats import gmean, hmean\n",
+"from sklearn.datasets import load_digits"
]
},
{
...
...