
Logbook: Machine Learning approaches

Data Science Asked by Jorge on June 21, 2021

In the past, when trying different machine learning algorithms to solve a problem, I used to write down the set of approaches in a notebook, keeping details such as features, feature preprocessing, normalization, algorithms, and algorithm parameters; in effect, a hand-written logbook.

However, I am now looking for a more professional tool, so that I can keep more details and share them with other team members, who could also record their own approaches.

Ideally it would be an automated, collaborative tool that keeps track of the work done, covering details like features, algorithms, algorithm parameters, data preprocessing, data, and metrics; something beyond a shared Google Drive spreadsheet, for instance.

How are you solving this? How are you keeping track of the work done? What’s your logbook tool?

Thank you very much in advance.

3 Answers

How are you solving this? How are you keeping track of the work done? What's your logbook tool?

This might not be the best approach, but it is how my team does it. We believe that proper communication is essential for pulling off an end-to-end data science experiment, so we use Slack for our discussions and meetings.

In addition, we keep Rmd (R Markdown) files documenting the planning and analysis stages.

Correct answer by Dawny33 on June 21, 2021

Check this out; it looks like exactly what you need.

Answered by Diego on June 21, 2021

How are you solving this? How are you keeping track of the work done? What's your logbook tool?

For my bachelor's thesis (write-math.com) I wrote my own little toolkit to iterate quickly over different models / preprocessing steps. Each experiment had one configuration file (see the hwr-experiments repository). For example:

data-source: feature-files/baseline-3-points
training: '{{nntoolkit}} train --epochs 1000 --learning-rate 0.1 --momentum 0.1 --print-errors --hook=''!detl
    test {{testing}},err=testresult_%e.txt'' {{training}} {{validation}}
    {{testing}} < {{src_model}} > {{target_model}} 2>> {{target_model}}.log'
model:
    type: mlp
    topology: 24:500:369

The trained model is stored, so it is fast to get the evaluation results (e.g. accuracy, confusion matrix).
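The general pattern here (one config file per experiment, results appended to a shared log) can be sketched in a few lines of stdlib Python. This is only an illustration of the idea, not the actual toolkit: the field names, the JSON config format, the JSON-lines logbook, and the example metrics value are all my own assumptions.

```python
import hashlib
import json
import pathlib
import tempfile

def config_id(cfg: dict) -> str:
    """Derive a stable short ID from the config, so that identical
    experiments map to the same entry in the logbook."""
    canonical = json.dumps(cfg, sort_keys=True)
    return hashlib.sha1(canonical.encode()).hexdigest()[:8]

def log_experiment(cfg: dict, metrics: dict, logbook: pathlib.Path) -> str:
    """Append one experiment (config plus metrics) as a JSON line."""
    entry = {"id": config_id(cfg), "config": cfg, "metrics": metrics}
    with logbook.open("a") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry["id"]

# Usage sketch: the config mirrors the fields from the example above.
cfg = {
    "data_source": "feature-files/baseline-3-points",
    "model": {"type": "mlp", "topology": "24:500:369"},
    "training": {"epochs": 1000, "learning_rate": 0.1, "momentum": 0.1},
}
logbook = pathlib.Path(tempfile.gettempdir()) / "experiments.jsonl"
# In a real run, the metrics dict would come from the training/evaluation step.
run_id = log_experiment(cfg, {"accuracy": 0.93}, logbook)
```

Because the run ID is a hash of the canonical config, re-running the same experiment maps to the same ID, which makes it easy to deduplicate and compare entries across team members.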

Answered by Martin Thoma on June 21, 2021

