National Data Archive
An Online Microdata Catalog

Titanic: Machine Learning from Disaster (Kaggle Competition)

cynthialo_titanic
cynthialo
Created on December 02, 2019. Last modified December 02, 2019.

Overview

Abstract
# titanic
Titanic: Machine Learning from Disaster (Kaggle Competition)

https://www.kaggle.com/c/titanic

This Kaggle competition asks users to "apply the tools of machine learning to predict which passengers survived the [sinking of the RMS Titanic] tragedy".

Data science involves three main steps:
* Data curation
* Data cleaning and integration
* Data analytics

The models in this repository include:
* Logistic regression: [genderclasslogisticregression.py](genderclasslogisticregression.py)

This is a simple classification model to get started with. It uses each passenger's socio-economic status (ticket class), sex, and fare as features. The model rests on the following assumptions:
* Passengers with incomplete data are removed
* No feature scaling is performed
* All features are assumed to be independent

The raw data can be downloaded directly from Kaggle.
Authoring entity
Agency Name: cynthialo
Role: owner
Language
English

Methods, software and scripts

Software
Name: Python
Libraries or packages used: pandas, itertools, sklearn, matplotlib, numpy
License
Name
MIT License

Metadata production

Producers
Name: GitHub Bot
Role: bot
Date of Production
01 December 2019