September 9, 2021 09:04 pm GMT

Light Bulb : Machine learning made easy

Light Bulb is a tool to help you label, train, test and deploy machine learning models without any coding.

Go directly to the GitHub project here.

Let's say you want to build a photo-sharing app called SnapCat that only allows users to send pictures of cats, and nothing else.

"Snapcat"Snapcat

How would you go about starting this? It'll probably look something like this:

  1. Collect a large set of cat and not cat photos.
  2. Manually label the posts as cat or not cat.
  3. Split the dataset into train, test, and validation sets.
  4. Train some model (let's say a convolutional neural network) on the dataset.
  5. Look at the accuracy on the test set; if it's not good enough, go back and rethink each step.
  6. Save the model weights, and load them into some web backend, to start classifying new posts.
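Steps 3 through 6 can be sketched in a few lines. This is a hypothetical toy sketch using NumPy and synthetic feature vectors (not Light Bulb's actual code, and not real image data): split the data, fit a simple classifier by gradient descent, and check held-out accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Steps 1-2 stand-in: pretend we have 200 labeled feature vectors
# (in practice, features extracted from cat / not-cat photos).
X = rng.normal(size=(200, 16))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)  # 1 = cat

# Step 3: split into train / validation / test (60 / 20 / 20).
idx = rng.permutation(200)
train, val, test = idx[:120], idx[120:160], idx[160:]

# Step 4: train a tiny logistic-regression "model" by gradient descent.
w = np.zeros(16)
for _ in range(500):
    p = 1 / (1 + np.exp(-X[train] @ w))            # predicted P(cat)
    w -= 0.1 * X[train].T @ (p - y[train]) / len(train)

# Step 5: measure accuracy on the held-out test set.
acc = np.mean((1 / (1 + np.exp(-X[test] @ w)) > 0.5) == y[test])
print(f"test accuracy: {acc:.2f}")

# Step 6: the weights w would then be saved and loaded by the web backend.
```

Every step here is manual, and step 2 (labeling) in particular doesn't scale; that's the part Light Bulb automates.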

Let's see how Light Bulb can help you with all of this.

Introducing: Light Bulb

Light Bulb is a service that will integrate this end to end, from labeling, all the way through to production. First define this config:

```yaml
# config/cat_not_cat.yml
task:
  title: Is this a cat?
dataset:
  directory: dataset/cat_not_cat/
  data_type: images
  judgements_file: outputs/cat_not_cat/labels.csv
label:
  type: classification
  classes:
    - Cat
    - Not Cat
model:
  directory: outputs/cat_not_cat/models/
user: chris
```

Then from the root of the app run:

```shell
make
make dataset/cat_not_cat
.virt/bin/python code/server.py --config config/cat_not_cat.yml
```

Which will start a server on http://localhost:5000.

Start Labeling

As you label a few entries, you'll see the Training icon change from No to Yes. This means that a model is actively training on the newly labeled posts. As you label more posts, the model gets smarter.

Light Bulb is getting smarter!

After labeling more images (and giving the model some time to train), you'll see an Accuracy statistic that shows how well the model is doing. In this case our model reaches about 87% accuracy, which is pretty amazing since we only labeled 78 images.


Labeling 2000 images in 15 minutes

Now that our model is trained, it'll start helping us label more data. Light Bulb will:

  • Go through the dataset and label images it feels fairly confident about (97% confident, in fact).
  • Store the labels the model assigns and present them to you in batches. All you have to do is confirm the labels are correct.
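The confidence filter above is simple to sketch. The 97% threshold is from the article; the function itself and the batch size are hypothetical, not Light Bulb's actual API:

```python
def autolabel_batches(scores, threshold=0.97, batch_size=20):
    """Pick images the model is confident about and group them into
    batches for a human to confirm.

    scores: list of (image_id, p_cat) pairs from the model.
    Returns batches of (image_id, proposed_label) pairs.
    """
    confident = [
        (image_id, "Cat" if p >= threshold else "Not Cat")
        for image_id, p in scores
        # Keep only predictions at >= 97% confidence in either class.
        if p >= threshold or p <= 1 - threshold
    ]
    return [confident[i:i + batch_size]
            for i in range(0, len(confident), batch_size)]

batches = autolabel_batches([("img1", 0.99), ("img2", 0.55), ("img3", 0.01)])
print(batches)  # img2 is skipped: the model isn't sure about it yet
```

Anything the model is unsure about stays in the normal labeling queue instead.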

Verifying the automatically labeled images

With this batch labeling feature, I managed to label all 2000 images in just 15 minutes.

Model Serving

Light Bulb also exposes an API for your model. You can easily issue API requests to the server and score new images. Let's see what our model thinks about this image.


```shell
curl --header "Content-Type: application/json" \
     --request POST \
     --data '{"type": "images", "urls": ["https://github.com/czhu12/light-bulb/raw/master/docs/images/cat-image-1.jpg"]}' \
     http://localhost:5000/score
```

Which returns:

{  "labels": [    "Cat",    "Not Cat"  ],  "scores": [    [      0.9971857666969299, # Our model thinks its 99% a cat!      0.0028141846414655447    ]  ]}

And now let's try something that's not a cat:


```shell
curl --header "Content-Type: application/json" \
     --request POST \
     --data '{"type": "images", "urls": ["https://raw.githubusercontent.com/czhu12/light-bulb/master/docs/images/not-cat-image-1.jpg"]}' \
     http://localhost:5000/score
```

Our model returns:

{  "labels": [    "Cat",    "Not Cat"  ],  "scores": [    [      0.007293896283954382,      0.9927061200141907 # Our model thinks this is 99.2% not a cat!    ]  ]}

How does it work?

Encoder-Decoder

Most deep learning tasks can be framed as an encoder-decoder architecture.

Images

For all image tasks, we use a SqueezeNet encoder pre-trained on ImageNet.

Image Classification: Image classification is done with a CNN-based encoder that feeds into a multi-layer perceptron decoder.

Image Classification

Object Detection (work in progress): Object detection can be framed as a CNN-based encoder that feeds into a regression decoder. For object detection, the decoder will be a pre-trained YOLOv3 head.

Object detection example

Text

For all text tasks, we use a 3-layer LSTM encoder pre-trained as a language model on WikiText-103.

Text Classification: Text classification can be framed as an LSTM encoder that outputs into a logistic regression decoder.

Sentiment classification is a classic text classification problem

Sequence Tagging (work in progress): Sequence tagging can be framed as an LSTM encoder, where at each time-step, the output is fed into a CRF model.

Named entity recognition is a sequence tagging problem.

Secret Sauce

Light Bulb uses a few tricks to train a model as quickly and efficiently as possible.

Active Learning


When Light Bulb decides which post to show you to label next, it chooses based on a process known as maximum entropy sampling. Before Light Bulb shows you an image to label, it'll first try to make a prediction. Let's say one image was scored as 95% cat, 5% not cat, and another image was scored as 50% cat, 50% not cat. Which one should Light Bulb show you next? Intuitively, the second image should be labeled next, since that's the one the model isn't quite sure about. This way, we don't waste any of your labels!
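Maximum entropy sampling just picks the unlabeled item whose predicted class distribution has the highest Shannon entropy. A minimal sketch of the idea (not Light Bulb's actual implementation):

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution, in bits.
    Peaks when all classes are equally likely (maximum uncertainty)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def next_to_label(predictions):
    """Index of the item the model is least sure about."""
    return max(range(len(predictions)), key=lambda i: entropy(predictions[i]))

preds = [[0.95, 0.05],   # confident: 95% cat (entropy ~0.29 bits)
         [0.50, 0.50]]   # a coin flip (entropy 1.0 bit)
print(next_to_label(preds))  # -> 1: the 50/50 image gets labeled next
```

The 95/5 image contributes little new information if you label it; the 50/50 image teaches the model the most.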

Pre-training

Light Bulb leverages state-of-the-art semi-supervised learning and pre-training. One of the reasons deep learning is so powerful is its unique ability to transfer knowledge from one task to another (see: word vectors, transfer learning). For instance, a model that is good at predicting the next word in a sentence, i.e. a language model, will also be good at classifying the sentiment of a sentence. Likewise, a model that is good at the ImageNet dataset will also be good at predicting cat vs. not cat, with small tweaks of course.

Semi Supervised Learning

Light Bulb also leverages semi-supervised learning techniques to learn as much as possible from the dataset you provide. In the cat vs. not cat dataset above, there were around 2000 images, but we only labeled about 80 of them. That doesn't mean we can't learn from the other 1920 images! Light Bulb uses all the images in the dataset, even the ones you haven't labeled, to train the model by fine-tuning an auto-encoder, which essentially learns the general properties of all 2000 images.
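To show the idea, here is a toy NumPy auto-encoder trained on stand-in data for all 2000 images (this is an illustration of the technique, not Light Bulb's actual training code). No labels are used: the network just learns to compress and reconstruct the inputs, and the resulting encoder weights can then be reused when fitting a classifier on the ~80 labeled examples.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 32))   # stand-in for all 2000 images, labeled or not

# One hidden layer: W1 encodes (compresses), W2 decodes (reconstructs).
W1 = rng.normal(scale=0.1, size=(32, 8))
W2 = rng.normal(scale=0.1, size=(8, 32))

losses = []
for _ in range(200):
    H = np.tanh(X @ W1)                 # encode
    X_hat = H @ W2                      # decode
    err = X_hat - X
    losses.append(float(np.mean(err ** 2)))  # reconstruction loss
    # Backprop through both layers, plain gradient descent.
    gW2 = H.T @ err / len(X)
    gH = err @ W2.T * (1 - H ** 2)      # tanh derivative
    gW1 = X.T @ gH / len(X)
    W1 -= 0.1 * gW1
    W2 -= 0.1 * gW2

print(f"reconstruction loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
# W1 now encodes the general structure of the data; a classifier head
# can be fine-tuned on top of it using only the labeled examples.
```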

For full details, check out the GitHub project.

For any questions, suggestions, or bugs, or if you just want to reach out, PM me on Twitter!

Happy Labeling!


Original Link: https://dev.to/mage_ai/light-bulb-machine-learning-made-easy-ink
