1. Business Understanding
This case study is intended to support technology companies in developing and improving components for autonomous vehicles. It asks: where do the weaknesses of the tested systems lie, and how can they be categorized? To answer this, the reasons why the human test driver intervened in the system are classified into the following categories:
- Software problems
- Hardware problems
- Software and hardware problems (problems that occur in combination)
- Problems caused by traffic control objects (e.g. malfunctioning traffic lights)
- Problems caused by other road users
- External influences (including, for example, obstructions, occluded elements, weather and road conditions)
- Other problems
2. Data and Data Understanding
The underlying dataset was obtained from Kaggle and records cases in which a driver had to intervene in autonomous vehicle technology. It covers 1 December 2018 to 30 November 2019 and is licensed as U.S. Government Works. The data is processed in a Jupyter notebook, which ships with the Anaconda data science distribution. The dataset has 8,885 rows and 9 columns. The following fields were recorded:
- Manufacturer
- Permit number
- Date
- Vehicle identification number (VIN)
- Whether the vehicle is capable of operating without a driver
- Whether a driver is present
- Who initiated the disengagement: AV system, test driver, remote operator, or passenger
- Location of the disengagement: interstate, freeway, highway, rural road, street, or parking facility
- Description of the facts causing the disengagement
2.1. Importing the relevant modules
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
pd.__version__
'1.3.4'
2.2. Reading the data
raw_data = pd.read_csv('https://storage.googleapis.com/ml-service-repository-datastorage/Improvement_of_components_for_autonomous_motor_vehicles_data.csv')
raw_data.head()
 | Manufacturer | Permit Number | DATE | VIN NUMBER | VEHICLE IS CAPABLE OF OPERATING WITHOUT A DRIVER (Yes or No) | DRIVER PRESENT (Yes or No) | DISENGAGEMENT INITIATED BY (AV System, Test Driver, Remote Operator, or Passenger) | DISENGAGEMENT LOCATION (Interstate, Freeway, Highway, Rural Road, Street, or Parking Facility) | DESCRIPTION OF FACTS CAUSING DISENGAGEMENT |
---|---|---|---|---|---|---|---|---|---|
0 | AImotive Inc. | AVT003 | 12.06.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... |
1 | AImotive Inc. | AVT003 | 12.10.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... |
2 | AImotive Inc. | AVT003 | 12.10.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... |
3 | AImotive Inc. | AVT003 | 04.23.2019 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... |
4 | AImotive Inc. | AVT003 | 05.14.2019 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver to the exit lane: risk of... |
2.3. Cleaning the data
raw_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8885 entries, 0 to 8884
Data columns (total 9 columns):
 #  Column  Non-Null Count  Dtype
--- ------  --------------  -----
 0  Manufacturer  8885 non-null  object
 1  Permit Number  8885 non-null  object
 2  DATE  8884 non-null  object
 3  VIN NUMBER  8884 non-null  object
 4  VEHICLE IS CAPABLE OF OPERATING WITHOUT A DRIVER (Yes or No)  8884 non-null  object
 5  DRIVER PRESENT (Yes or No)  8884 non-null  object
 6  DISENGAGEMENT INITIATED BY (AV System, Test Driver, Remote Operator, or Passenger)  8884 non-null  object
 7  DISENGAGEMENT LOCATION (Interstate, Freeway, Highway, Rural Road, Street, or Parking Facility)  8884 non-null  object
 8  DESCRIPTION OF FACTS CAUSING DISENGAGEMENT  8884 non-null  object
dtypes: object(9)
memory usage: 624.9+ KB
raw_data.describe(include="all")
 | Manufacturer | Permit Number | DATE | VIN NUMBER | VEHICLE IS CAPABLE OF OPERATING WITHOUT A DRIVER (Yes or No) | DRIVER PRESENT (Yes or No) | DISENGAGEMENT INITIATED BY (AV System, Test Driver, Remote Operator, or Passenger) | DISENGAGEMENT LOCATION (Interstate, Freeway, Highway, Rural Road, Street, or Parking Facility) | DESCRIPTION OF FACTS CAUSING DISENGAGEMENT |
---|---|---|---|---|---|---|---|---|---|
count | 8885 | 8885 | 8884 | 8884 | 8884 | 8884 | 8884 | 8884 | 8884 |
unique | 28 | 27 | 3711 | 289 | 5 | 4 | 4 | 11 | 469 |
top | Toyota Research Institute | AVT050 | 3/28/2019 | JTHDU1EF3G5020098 | No | Yes | Test Driver | Street | Safety Driver proactive disengagement. |
freq | 2947 | 2947 | 59 | 900 | 4369 | 4934 | 6037 | 4668 | 1780 |
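The `DATE` column appears to mix two notations: the first rows use dots (`12.06.2018`, `04.23.2019`), while the most frequent value uses slashes (`3/28/2019`). The following is a minimal sketch of one way to normalize both into real timestamps. It assumes every entry is month-day-year (the value `04.23.2019` suggests this, but it has not been checked for the whole dataset), and `normalize_dates` is a hypothetical helper, not part of the original notebook.

```python
import pandas as pd

def normalize_dates(dates: pd.Series) -> pd.Series:
    """Parse 'MM.DD.YYYY' and 'M/D/YYYY' strings into datetime64 values."""
    # Unify the separator so that one format string covers both notations.
    unified = dates.str.replace(".", "/", regex=False)
    # errors="coerce" turns anything that still does not parse into NaT
    # instead of raising, so odd entries can be inspected afterwards.
    return pd.to_datetime(unified, format="%m/%d/%Y", errors="coerce")

# Three values taken from the outputs shown above.
sample = pd.Series(["12.06.2018", "04.23.2019", "3/28/2019"])
parsed = normalize_dates(sample)
print(parsed.dt.strftime("%Y-%m-%d").tolist())
```

Applied to the notebook, this would be called as `normalize_dates(raw_data['DATE'])`; counting the resulting `NaT` values would show how many entries follow neither notation.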
raw_data.rename(columns = {
    'VEHICLE IS CAPABLE OF OPERATING WITHOUT A DRIVER\n(Yes or No)': 'OPERATING WITHOUT DRIVER',
    'DRIVER PRESENT\n(Yes or No)': 'DRIVER PRESENT',
    'DISENGAGEMENT INITIATED BY\n(AV System, Test Driver, Remote Operator, or Passenger)': 'DISENGAGEMENT INITIATED BY',
    'DISENGAGEMENT\nLOCATION\n(Interstate, Freeway, Highway, Rural Road, Street, or Parking Facility)': 'DISENGAGEMENT LOCATION',
    'DESCRIPTION OF FACTS CAUSING DISENGAGEMENT': 'FACTS CAUSING DISENGAGEMENT',
}, inplace = True)
Missing values
raw_data.isnull().sum()
Manufacturer                   0
Permit Number                  0
DATE                           1
VIN NUMBER                     1
OPERATING WITHOUT DRIVER       1
DRIVER PRESENT                 1
DISENGAGEMENT INITIATED BY     1
DISENGAGEMENT LOCATION         1
FACTS CAUSING DISENGAGEMENT    1
dtype: int64
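# .copy() makes data_no_mv an independent DataFrame, so later column
# assignments do not trigger a SettingWithCopyWarning.
data_no_mv = raw_data.dropna(axis=0).copy()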
data_no_mv.isnull().sum()
Manufacturer                   0
Permit Number                  0
DATE                           0
VIN NUMBER                     0
OPERATING WITHOUT DRIVER       0
DRIVER PRESENT                 0
DISENGAGEMENT INITIATED BY     0
DISENGAGEMENT LOCATION         0
FACTS CAUSING DISENGAGEMENT    0
dtype: int64
data_no_mv.describe(include='all')
 | Manufacturer | Permit Number | DATE | VIN NUMBER | OPERATING WITHOUT DRIVER | DRIVER PRESENT | DISENGAGEMENT INITIATED BY | DISENGAGEMENT LOCATION | FACTS CAUSING DISENGAGEMENT |
---|---|---|---|---|---|---|---|---|---|
count | 8884 | 8884 | 8884 | 8884 | 8884 | 8884 | 8884 | 8884 | 8884 |
unique | 27 | 26 | 3711 | 289 | 5 | 4 | 4 | 11 | 469 |
top | Toyota Research Institute | AVT050 | 3/28/2019 | JTHDU1EF3G5020098 | No | Yes | Test Driver | Street | Safety Driver proactive disengagement. |
freq | 2947 | 2947 | 59 | 900 | 4369 | 4934 | 6037 | 4668 | 1780 |
Data preparation and labeling
data_no_mv['DISENGAGEMENT INITIATED BY'].value_counts()
Test Driver         6037
AV System           2698
Vehicle Operator      81
Test driver           68
Name: DISENGAGEMENT INITIATED BY, dtype: int64
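The counts list `Test Driver` (6037) and `Test driver` (68) as separate values, which looks like a capitalization inconsistency rather than two distinct initiators. A small sketch of how the two spellings could be merged follows. The explicit mapping is an assumption about what the data means, and the toy series only reuses the four labels shown above.

```python
import pandas as pd

# Toy series containing the four labels from the value_counts() output.
initiated_by = pd.Series(
    ["Test Driver", "AV System", "Vehicle Operator", "Test driver"]
)

# An explicit mapping is used instead of str.title(), because title-casing
# would also turn "AV System" into "Av System".
canonical = {"Test driver": "Test Driver"}
cleaned = initiated_by.replace(canonical)

print(cleaned.value_counts())
```

In the notebook, the same `replace` call could be applied to `data_no_mv['DISENGAGEMENT INITIATED BY']` before the column is one-hot encoded.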
data_no_mv['FACTS CAUSING DISENGAGEMENT'].nunique()
469
data_no_mv.head()
 | Manufacturer | Permit Number | DATE | VIN NUMBER | OPERATING WITHOUT DRIVER | DRIVER PRESENT | DISENGAGEMENT INITIATED BY | DISENGAGEMENT LOCATION | FACTS CAUSING DISENGAGEMENT |
---|---|---|---|---|---|---|---|---|---|
0 | AImotive Inc. | AVT003 | 12.06.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... |
1 | AImotive Inc. | AVT003 | 12.10.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... |
2 | AImotive Inc. | AVT003 | 12.10.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... |
3 | AImotive Inc. | AVT003 | 04.23.2019 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... |
4 | AImotive Inc. | AVT003 | 05.14.2019 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver to the exit lane: risk of... |
data_no_mv['FACTS CAUSING DISENGAGEMENT'].value_counts()
Safety Driver proactive disengagement.    1780
Disengage for unwanted maneuver of the vehicle caused by a planning discrepancy while generating an appropriate trajectory    805
Automatic disengagement caused by planner fault.    742
Disengage due to operator discomfort    636
Disengage for a software fault due to a potential performance issue with a software component of the self-driving system (including third party software components)    482
...
Planning Logic: planner inadequately yields for cross traffic agent with right-of-way    1
Planning Logic: planned trajectory fails to avoid vehicle stopped ahead intersection    1
Object Perception: inaccurate perception of animal slowly crossing road leads to planned trajectory overlap    1
Planning Logic: incorrect behavior prediction for oncoming vehicle results in a planned trajectory that overlaps with the vehicle    1
Planning discrepancy; system planned incorrect trajectory to avoid oncoming traffic    1
Name: FACTS CAUSING DISENGAGEMENT, Length: 469, dtype: int64
data_reduced = data_no_mv.loc[0:1999, :]
data_reduced.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2000 entries, 0 to 1999
Data columns (total 9 columns):
 #  Column  Non-Null Count  Dtype
--- ------  --------------  -----
 0  Manufacturer  2000 non-null  object
 1  Permit Number  2000 non-null  object
 2  DATE  2000 non-null  object
 3  VIN NUMBER  2000 non-null  object
 4  OPERATING WITHOUT DRIVER  2000 non-null  object
 5  DRIVER PRESENT  2000 non-null  object
 6  DISENGAGEMENT INITIATED BY  2000 non-null  object
 7  DISENGAGEMENT LOCATION  2000 non-null  object
 8  FACTS CAUSING DISENGAGEMENT  2000 non-null  object
dtypes: object(9)
memory usage: 156.2+ KB
data_reduced['FACTS CAUSING DISENGAGEMENT'].nunique()
92
# Keep only the first row for each distinct description. Assigning the result
# (instead of inplace=True on a slice) avoids the SettingWithCopyWarning.
data_reduced = data_reduced.drop_duplicates(subset="FACTS CAUSING DISENGAGEMENT", keep="first")
data_reduced = data_reduced.reset_index()
data_reduced['FACTS CAUSING DISENGAGEMENT'] = data_reduced['FACTS CAUSING DISENGAGEMENT'].astype('str')
data_reduced['FACTS CAUSING DISENGAGEMENT'].dtype
dtype('O')
import re
# Note the comma after "software discrepancy": without it, Python silently joins
# that string with "trajectory planning" into a single keyword.
software_keywords = ["software performance", "software fault", "software discrepancy",
                     "trajectory planning", "planning discrepancy", "planning error",
                     "wrong lane change suggestion", "wrong lane association", "data recording",
                     "improper lane-change plan", "undesirable manuever", "undesirable yielding maneuver",
                     "outside of rate requirements", "merged poorly", "mapping issue", "software issue",
                     "poor trajectory across lanes", "incorrect assessment", "incorrect behavior",
                     "unprotected", "Poor lane change", "very wide", "wrong object prediction", "undesired motion",
                     "unwanted maneuver", "perception discrepancy", "ghost object prediction",
                     "driving faster than driver expected", "expected path",
                     "not initialized correctly", "software module", "perception mismatch", "estimation",
                     "planner fault", "unstable"]
data_no_mv.loc[data_no_mv['FACTS CAUSING DISENGAGEMENT'].str.contains('|'.join(software_keywords), na=False, case=False)
, 'Problem class'] = "Software"
hardware_keywords = ["hardware performance", "hardware diagnostics", "controls diagnostics", "actor",
"yield to other actors", "Hardware irregularity", "weather conditions", "Autobox",
"performance issue with a hardware component"]
data_no_mv.loc[data_no_mv['FACTS CAUSING DISENGAGEMENT'].str.contains('|'.join(hardware_keywords), na=False, case=False)
, 'Problem class'] = "Hardware"
soft_hardware_keywords = ["timed out", "timeout", "too long", "incorrect detection", "Lost track", "Localization",
"geo-location related", "unsuccessful right turn", "unsuccessful left turn", "system",
"traffic conditions", "failed to detect an object correctly", " took longer than expected",
"main computer froze", "not braking correctly", "not speeding up correctly",
"not turning enough", "not slowing down enough", "didn't detect", "Sensor Fusion discrepancy",
"did not meet expectation"]
data_no_mv.loc[data_no_mv['FACTS CAUSING DISENGAGEMENT'].str.contains('|'.join(soft_hardware_keywords), na=False, case=False)
, 'Problem class'] = "Software/Hardware"
Traffic_ctrl_keywords = ["unstable target lane", "Traffic light error", "Stop sign error"]
data_no_mv.loc[data_no_mv['FACTS CAUSING DISENGAGEMENT'].str.contains('|'.join(Traffic_ctrl_keywords), na=False, case=False)
, 'Problem class'] = "Traffic control objects"
road_user_keywords = ["reckless driver", "behaving road user", "other road user"]
data_no_mv.loc[data_no_mv['FACTS CAUSING DISENGAGEMENT'].str.contains('|'.join(road_user_keywords), na=False, case=False)
, 'Problem class'] = "Other road user"
external_influences = ["obstruction", "encroachment", "occluded view", "surface conditions", "wheater"]
data_no_mv.loc[data_no_mv['FACTS CAUSING DISENGAGEMENT'].str.contains('|'.join(external_influences), na=False, case=False)
, 'Problem class'] = "External influences"
data_no_mv.loc[data_no_mv['Problem class'].isnull()
, 'Problem class'] = "Other problems"
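Because each `.loc[...] = ...` assignment above overwrites whatever label a row already has, a description that matches several keyword lists ends up with the class of the last list that matched. The following sketch makes that precedence explicit on toy data. The helper `classify`, the shortened rule set, and the second and third sample descriptions are illustrative assumptions, not part of the original notebook.

```python
import re
import pandas as pd

def classify(descriptions: pd.Series, rules: dict,
             default: str = "Other problems") -> pd.Series:
    """Assign a class per description; later rules override earlier ones."""
    labels = pd.Series(default, index=descriptions.index, dtype="object")
    for label, keywords in rules.items():
        # re.escape keeps keywords literal even if they contain regex characters.
        pattern = "|".join(re.escape(keyword) for keyword in keywords)
        mask = descriptions.str.contains(pattern, case=False, na=False)
        labels.loc[mask] = label
    return labels

# Shortened rule set; dicts preserve insertion order, so "Software/Hardware"
# is applied after "Software", as in the cells above.
rules = {
    "Software": ["planner fault", "software fault"],
    "Software/Hardware": ["system"],
}

descriptions = pd.Series([
    "Automatic disengagement caused by planner fault.",
    "Software fault in the perception system",   # matches both rules
    "Disengage due to operator discomfort",      # matches no rule
])

result = classify(descriptions, rules)
print(result.tolist())
```

The second description matches `software fault` first and is then relabeled by the broader keyword `system`, which shows how much short, generic keywords can influence the final class distribution.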
data_no_mv.head()
 | Manufacturer | Permit Number | DATE | VIN NUMBER | OPERATING WITHOUT DRIVER | DRIVER PRESENT | DISENGAGEMENT INITIATED BY | DISENGAGEMENT LOCATION | FACTS CAUSING DISENGAGEMENT | Problem class |
---|---|---|---|---|---|---|---|---|---|---|
0 | AImotive Inc. | AVT003 | 12.06.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... | Traffic control objects |
1 | AImotive Inc. | AVT003 | 12.10.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... | Traffic control objects |
2 | AImotive Inc. | AVT003 | 12.10.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... | Traffic control objects |
3 | AImotive Inc. | AVT003 | 04.23.2019 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... | Other problems |
4 | AImotive Inc. | AVT003 | 05.14.2019 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver to the exit lane: risk of... | Software |
from collections import Counter
# Count how often each problem class was assigned in the labeled data.
problem_counts = Counter(data_no_mv['Problem class'])
plt.title("Problems")
plt.xticks(rotation = 45)
plt.bar(list(problem_counts.keys()), list(problem_counts.values()))
Classification model
data_preprocessed = data_no_mv.copy()
data_preprocessed['Problem class']=data_preprocessed['Problem class'].map(
{'Software':0,'Hardware':1,'Software/Hardware':2,
'Traffic control objects':3,'Other road user':4, 'External influences':5,
'Other problems':6})
data_preprocessed.head()
 | Manufacturer | Permit Number | DATE | VIN NUMBER | OPERATING WITHOUT DRIVER | DRIVER PRESENT | DISENGAGEMENT INITIATED BY | DISENGAGEMENT LOCATION | FACTS CAUSING DISENGAGEMENT | Problem class |
---|---|---|---|---|---|---|---|---|---|---|
0 | AImotive Inc. | AVT003 | 12.06.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... | 3 |
1 | AImotive Inc. | AVT003 | 12.10.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... | 3 |
2 | AImotive Inc. | AVT003 | 12.10.2018 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... | 3 |
3 | AImotive Inc. | AVT003 | 04.23.2019 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver: risk of lane departure, ... | 6 |
4 | AImotive Inc. | AVT003 | 05.14.2019 | JTDKN3DU5A1092792 | No | Yes | Test Driver | Freeway | Lane change maneuver to the exit lane: risk of... | 0 |
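Once the class names are mapped to integers, model predictions come back as codes such as `3` or `6`. A short sketch of how the mapping used above can be inverted to turn codes back into readable class names follows. The name `code_to_class` and the example predictions are assumptions for illustration.

```python
# The same mapping as in the cell above.
class_to_code = {
    "Software": 0,
    "Hardware": 1,
    "Software/Hardware": 2,
    "Traffic control objects": 3,
    "Other road user": 4,
    "External influences": 5,
    "Other problems": 6,
}

# Invert the dictionary; this is safe because every code is unique.
code_to_class = {code: name for name, code in class_to_code.items()}

# Hypothetical predicted codes, decoded back to class names.
predicted_codes = [3, 6, 0]
decoded = [code_to_class[code] for code in predicted_codes]
print(decoded)
```

The decoded names for `3`, `6`, and `0` correspond to the `Problem class` values shown for rows 0, 3, and 4 before the mapping was applied.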
data_preprocessed.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 8884 entries, 0 to 8884
Data columns (total 10 columns):
 #  Column  Non-Null Count  Dtype
--- ------  --------------  -----
 0  Manufacturer  8884 non-null  object
 1  Permit Number  8884 non-null  object
 2  DATE  8884 non-null  object
 3  VIN NUMBER  8884 non-null  object
 4  OPERATING WITHOUT DRIVER  8884 non-null  object
 5  DRIVER PRESENT  8884 non-null  object
 6  DISENGAGEMENT INITIATED BY  8884 non-null  object
 7  DISENGAGEMENT LOCATION  8884 non-null  object
 8  FACTS CAUSING DISENGAGEMENT  8884 non-null  object
 9  Problem class  8884 non-null  int64
dtypes: int64(1), object(9)
memory usage: 1021.5+ KB
data_with_dummies = pd.get_dummies(data_preprocessed, drop_first=True)
data_with_dummies.head()
Problem class | Manufacturer_Apple Inc. | Manufacturer_Aurora Innovation, Inc. | Manufacturer_AutoX Technologies, Inc. | Manufacturer_BMW of North America | Manufacturer_Baidu USA LLC | Manufacturer_CRUISE LLC | Manufacturer_Drive.ai Inc | Manufacturer_Lyft | Manufacturer_Mercedes-Benz Research & Development North America, Inc. | ... | FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, \nprecautionary takeover to address perception, \nother road user behaving poorly | FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, \nprecautionary takeover to address perception, AV made unsuccessful left turn | FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, \nprecautionary takeover to address perception, third party lane encroachment | FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, \nprecautionary takeover to address perception, third party lane obstruction | FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, AV lane change issues | FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, AV made unsuccessful left turn | FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, other road user behaving poorly | FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, third party lane encroachment | FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, third party lane obstruction | FACTS CAUSING DISENGAGEMENT_prediction discrepancy, a vehicle in the front was backing up, ego was not able to predict this behavior correctly. | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
5 rows × 4538 columns
data_with_dummies.isnull().sum()
Problem class    0
Manufacturer_Apple Inc.    0
Manufacturer_Aurora Innovation, Inc.    0
Manufacturer_AutoX Technologies, Inc.    0
Manufacturer_BMW of North America    0
..
FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, AV made unsuccessful left turn    0
FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, other road user behaving poorly    0
FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, third party lane encroachment    0
FACTS CAUSING DISENGAGEMENT_precautionary takeover to address planning, third party lane obstruction    0
FACTS CAUSING DISENGAGEMENT_prediction discrepancy, a vehicle in the front was backing up, ego was not able to predict this behavior correctly.    0
Length: 4538, dtype: int64
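`pd.get_dummies(..., drop_first=True)` turns every distinct value of each text column into its own 0/1 column, minus one per source column. With the unique counts reported by `describe` earlier, that gives 26 + 25 + 3710 + 288 + 4 + 3 + 3 + 10 + 468 = 4537 indicator columns plus `Problem class`, which matches the 4538 columns shown. Of these, 3710 encode individual dates and 468 encode the exact description text from which the labels were derived, which may be worth keeping in mind when reading the accuracy figures. The sketch below checks the same arithmetic on a made-up toy frame; its column names and values are assumptions.

```python
import pandas as pd

# Toy frame: one numeric target and two text columns.
toy = pd.DataFrame({
    "Problem class": [0, 6, 0, 2],
    "Manufacturer": ["A", "A", "B", "C"],   # 3 distinct values
    "FACTS": ["x", "y", "x", "z"],          # 3 distinct values
})

# Only the text columns are expanded; the numeric target is left as-is.
dummies = pd.get_dummies(toy, drop_first=True)

# Expected width: 1 target column + (distinct values - 1) per text column.
text_columns = ["Manufacturer", "FACTS"]
expected_width = 1 + sum(toy[col].nunique() - 1 for col in text_columns)

print(dummies.shape[1], expected_width)
print(list(dummies.columns))
```

Passing `columns=[...]` to `pd.get_dummies` would restrict the encoding to selected columns, should the date, VIN, or description columns be left out.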
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
target = data_with_dummies['Problem class']
inputs = data_with_dummies.drop(['Problem class'],axis=1)
x_train, x_test, y_train, y_test = train_test_split(inputs, target, test_size=0.2, random_state=365)
# histplot replaces the deprecated distplot; stat='density' and kde=True
# approximate distplot's default histogram-plus-density output.
sns.histplot(y_test, bins=50, stat='density', kde=True)
<AxesSubplot:xlabel='Problem class', ylabel='Density'>
K-Nearest Neighbors
from sklearn.neighbors import KNeighborsClassifier
error_rate = []
for i in range(1,10):
    KNN_model = KNeighborsClassifier(n_neighbors=i)
    KNN_model.fit(x_train,y_train)
    KNN_prediction = KNN_model.predict(x_test)
    error_rate.append(np.mean(KNN_prediction != y_test))  # share of misclassified test samples
plt.figure(figsize=(10,6))
plt.plot(range(1,10),error_rate,color='blue', linestyle='dashed', marker='o',
markerfacecolor='red', markersize=10)
plt.title('Error Rate vs. K Values')
plt.xlabel('K')
plt.ylabel('Error Rate')
Text(0, 0.5, 'Error Rate')
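The loop above records one error rate per value of K, and the plot is then read to choose a value. As a minimal sketch, the same choice can be made programmatically by taking the K with the lowest recorded error. The error values below are made up purely for illustration and are not the notebook's results.

```python
# Candidate K values, matching range(1, 10) in the loop above.
ks = list(range(1, 10))

# Illustrative, made-up error rates (one per K); not taken from the notebook.
error_rate = [0.050, 0.040, 0.030, 0.035, 0.040, 0.045, 0.050, 0.050, 0.060]

# Index of the smallest error; on ties, min() returns the first one,
# which corresponds to the smallest K.
best_index = min(range(len(error_rate)), key=error_rate.__getitem__)
best_k = ks[best_index]

print(best_k, error_rate[best_index])
```

Since this selection uses the test set, an error estimate reported on that same test set tends to be somewhat optimistic; a separate validation split or cross-validation would be one way to avoid that.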
KNN_model = KNeighborsClassifier(n_neighbors=3)
KNN_model.fit(x_train, y_train)
KNeighborsClassifier(n_neighbors=3)
KNN_prediction = KNN_model.predict(x_test)
accuracy_score(y_test, KNN_prediction)
0.9712999437253799
print(classification_report(y_test, KNN_prediction))
              precision    recall  f1-score   support

           0       0.97      0.98      0.97       531
           1       0.82      0.82      0.82        17
           2       0.97      0.97      0.97       527
           3       0.00      0.00      0.00         1
           4       0.74      0.83      0.78        24
           5       1.00      0.70      0.82        10
           6       0.99      0.98      0.98       667

    accuracy                           0.97      1777
   macro avg       0.78      0.76      0.77      1777
weighted avg       0.97      0.97      0.97      1777
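scikit-learn's metric functions take the true labels first and the predictions second, i.e. `classification_report(y_true, y_pred)`. Accuracy is unaffected by the order, but per-class precision and recall trade places when the arguments are swapped, and the support column then counts predictions instead of true samples. The sketch below illustrates this with a hand-written helper on made-up labels; `precision_recall` is a hypothetical function, not a scikit-learn API.

```python
def precision_recall(y_true, y_pred, label):
    """Return (precision, recall) for one class, using 0.0 when undefined."""
    true_positives = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    predicted = sum(p == label for p in y_pred)   # how often the class was predicted
    actual = sum(t == label for t in y_true)      # how often the class really occurs
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Made-up labels: class 1 occurs twice but is predicted three times.
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 1]

correct_order = precision_recall(y_true, y_pred, label=1)
swapped_order = precision_recall(y_pred, y_true, label=1)

print("true first:     ", correct_order)
print("predicted first:", swapped_order)
```

With the true labels first, class 1 has the lower precision and perfect recall; with the arguments swapped, the two numbers are exchanged.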
Decision tree
from sklearn.tree import DecisionTreeClassifier
dtree = DecisionTreeClassifier()
dtree.fit(x_train,y_train)
DecisionTreeClassifier()
predictions = dtree.predict(x_test)
print(classification_report(y_test,predictions))
              precision    recall  f1-score   support

           0       0.99      0.99      0.99       531
           1       0.94      1.00      0.97        17
           2       0.99      0.98      0.99       527
           3       1.00      1.00      1.00         1
           4       0.79      0.96      0.87        24
           5       1.00      0.80      0.89        10
           6       0.99      0.99      0.99       667

    accuracy                           0.99      1777
   macro avg       0.96      0.96      0.96      1777
weighted avg       0.99      0.99      0.99      1777