-
Alexander Roßmann authored
The repo was initially set up with prepared ML services. This also includes the description of the repo in a README, style guides and requirements
Alexander Roßmann authoredThe repo was initially set up with prepared ML services. This also includes the description of the repo in a README, style guides and requirements
Style Guide
for this Repository
All notebooks shall adhere to this guide.
File structure:
├─── Name of ML use case
+-- data.csv
+-- notebook.ipynb
\-- README.md
The names of the files shall be as shown above.
If there are multiple data files you can use a data folder.
Make sure you adjust your filepath stings in your notebook!
For notebooks
- name of notebook is notebook.ipynb
- all import Statements on Top of the the file
- everything is in english: Variables, comments, figure title, figure axis, and Markdown text
adhere to PEP 8
https://www.youtube.com/watch?v=D4_s3q038I0&t=1182s
! use automatic formatting with something like autopep8 or black
howto for jupyter notebook
notebook structure
a notebook should have this structure
- Business Understanding
- Data and Data Understanding
2.1. Import of Relevant Modules
2.2. Read Data
2.3. Data Cleaning
2.4. Descriptive Analytics
2.4.1. Continous Features
2.4.2. Categorical Features - Data Preparation
3.1. Reduce Customer ID
3.2. Recoding of Categorical Variables
3.3. Test for Multicollinearity
3.4. Feature Scaling
3.5. Undersampling
3.6. Create Test and Training Data - Modelling and Evaluation
4.1. Logistic Regression
4.2 Evaluation
4.3. Interpretation
4.4. Model Optimization - Deployment
Variable names
use snake_case always no camelCase !!
Variables:
data_raw = pd.read_csv("data.csv")
data_cleaned
data_1
data_2
data_3
y = data["Target Variable"]
x = data.drop(["TargetVariable"])
x = scaler.fit_transform(x)
X_train, X_test, y_train, y_test = train_test_split(X_resampled, y_resampled, random_state=110)
model_lin_regression = LinearRegression()
model_lin_regression.fit(x_train,y_train)
model_nearest_neighbors = NearestNeighbors().fit(x_train)