Skip to content
Snippets Groups Projects
Code owners
Assign users and groups as approvers for specific file changes. Learn more.

Style Guide

for this Repository

All notebooks shall adhere to this guide.

File structure:

├─── Name of ML use case  
    +-- data.csv  
    +-- notebook.ipynb  
    \-- README.md  

The names of the files shall be as shown above.
If there are multiple data files you can use a data folder.
Make sure you adjust your filepath stings in your notebook!

For notebooks

  • name of notebook is notebook.ipynb
  • all import Statements on Top of the the file
  • everything is in english: Variables, comments, figure title, figure axis, and Markdown text

adhere to PEP 8

https://www.youtube.com/watch?v=D4_s3q038I0&t=1182s
! use automatic formatting with something like autopep8 or black
howto for jupyter notebook

notebook structure

a notebook should have this structure

  1. Business Understanding
  2. Data and Data Understanding
    2.1. Import of Relevant Modules
    2.2. Read Data
    2.3. Data Cleaning
    2.4. Descriptive Analytics
    2.4.1. Continous Features
    2.4.2. Categorical Features
  3. Data Preparation
    3.1. Reduce Customer ID
    3.2. Recoding of Categorical Variables
    3.3. Test for Multicollinearity
    3.4. Feature Scaling
    3.5. Undersampling
    3.6. Create Test and Training Data
  4. Modelling and Evaluation
    4.1. Logistic Regression
    4.2 Evaluation
    4.3. Interpretation
    4.4. Model Optimization
  5. Deployment

Variable names

use snake_case always no camelCase !!

Variables:

data_raw = pd.read_csv("data.csv")
data_cleaned
data_1
data_2
data_3
y = data["Target Variable"]
x = data.drop(["TargetVariable"])
x = scaler.fit_transform(x) X_train, X_test, y_train, y_test = train_test_split(X_resampled, y_resampled, random_state=110)
model_lin_regression = LinearRegression()
model_lin_regression.fit(x_train,y_train)
model_nearest_neighbors = NearestNeighbors().fit(x_train)