How to Analyze Tweet Sentiments with PHP Machine Learning

 

 


As of late, it seems everyone and their proverbial grandma is talking about Machine Learning. Your social media feeds are inundated with posts about ML, Python, TensorFlow, Spark, Scala, Go and so on; and if you are anything like me, you might be wondering, what about PHP?

Yes, what about Machine Learning and PHP? Fortunately, someone was crazy enough not only to ask that question, but also to develop a generic machine learning library that we can use in our next project. In this post we are going to take a look at PHP-ML – a machine learning library for PHP – and we’ll write a sentiment analysis class that we can later reuse for our own chat or tweet bot. The main goals of this post are:

  • Explore the general concepts around Machine learning and Sentiment Analysis
  • Review the capabilities and shortcomings of PHP-ML
  • Define the problem we are going to work on
  • Prove that trying to do Machine learning in PHP isn’t a completely crazy goal (optional)

What is Machine Learning?

Machine learning is a subset of Artificial Intelligence that focuses on giving “computers the ability to learn without being explicitly programmed”. This is achieved by using generic algorithms that can “learn” from a particular set of data.

For example, one common usage of machine learning is classification. Classification algorithms are used to put data into different groups or categories. Some examples of classification applications are:

  • Email spam filters
  • Market segmentation
  • Fraud detection

Machine learning is something of an umbrella term that covers many generic algorithms for different tasks. These algorithms fall into two main types, according to how they learn: supervised learning and unsupervised learning.

Supervised Learning

In supervised learning, we train our algorithm using labelled data in the form of an input object (vector) and a desired output value; the algorithm analyzes the training data and produces what is referred to as an inferred function which we can apply to a new, unlabelled dataset.

For the remainder of this post we will focus on supervised learning, simply because it’s easier to see and validate the relationship between input and output. Keep in mind that both approaches are equally important and interesting; one could argue that unsupervised learning is more useful because it removes the need for labelled data.
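The supervised workflow described above (labelled inputs in, an inferred function out) can be sketched in a few lines. This is a minimal illustration in plain Python rather than PHP-ML, assuming a bag-of-words Naive Bayes classifier with add-one smoothing; the function names `tokenize`, `train`, and `predict` and the training sentences are made up for this example and are not part of any library.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(samples, labels):
    """Count words per label; these counts play the role of the inferred function."""
    word_counts = defaultdict(Counter)
    label_counts = Counter(labels)
    for text, label in zip(samples, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up labelled training data, for illustration only.
samples = [
    "great flight and friendly crew",
    "loved the quick boarding",
    "flight delayed and bags lost",
    "terrible service and rude staff",
]
labels = ["positive", "positive", "negative", "negative"]

model = train(samples, labels)
print(predict(model, "friendly crew and quick boarding"))
print(predict(model, "rude staff and bags lost"))
```

The point of the sketch is the split between `train`, which only ever sees labelled examples, and `predict`, which is applied to new, unlabelled text.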

Unsupervised Learning

This type of learning on the other hand works with unlabelled data from the get-go. We don’t know the desired output values of the dataset and we are letting the algorithm draw inferences from datasets; unsupervised learning is especially handy when doing exploratory data analysis to find hidden patterns in the data.
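To make the contrast concrete, here is a small sketch of an unsupervised algorithm: a basic k-means loop with two clusters over one-dimensional data. The data (tweet lengths) and the function name `k_means_1d` are hypothetical and chosen only for illustration; note that no labels are supplied anywhere.

```python
def k_means_1d(values, iterations=10):
    """Split numbers into two clusters with a basic k-means loop (k = 2)."""
    centroids = [min(values), max(values)]  # simple deterministic start
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for value in values:
            # Assign each value to its nearest centroid.
            nearest = min((0, 1), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters


# Hypothetical data: tweet lengths in characters, with no labels attached.
tweet_lengths = [12, 15, 14, 118, 125, 130, 11, 122]
short, long_ = k_means_1d(tweet_lengths)
print(sorted(short))
print(sorted(long_))
```

The algorithm discovers the "short" and "long" groups on its own; deciding what those groups mean is left to whoever explores the data.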

PHP-ML

Meet PHP-ML, a library that claims to be a fresh approach to Machine Learning in PHP. The library implements algorithms, neural networks, and tools to do data pre-processing, cross validation, and feature extraction.

I’ll be the first to admit PHP is an unusual choice for machine learning, as the language’s strengths are not that well suited for Machine Learning applications. That said, not every machine learning application needs to process petabytes of data and do massive calculations – for simple applications, we should be able to get away with using PHP and PHP-ML.

The best use case that I can see for this library right now is the implementation of a classifier, be it something like a spam filter or even sentiment analysis. We are going to define a classification problem and build a solution step by step to see how we can use PHP-ML in our projects.

The Problem

To illustrate the process of using PHP-ML and adding some machine learning to our applications, I wanted to find a fun problem to tackle. What better way to showcase a classifier than building a tweet sentiment analysis class?

One of the key requirements needed to build successful machine learning projects is a decent starting dataset. Datasets are critical since they will allow us to train our classifier against already classified examples. As there has recently been significant noise in the media around airlines, what better dataset to use than tweets from customers to airlines?

Fortunately, a dataset of tweets is already available to us thanks to Kaggle. The Twitter US Airline Sentiment dataset can be downloaded from their site.

The Solution

Let’s begin by taking a look at the dataset we will be working on. The raw dataset has the following columns:

  • tweet_id
  • airline_sentiment
  • airline_sentiment_confidence
  • negativereason
  • negativereason_confidence
  • airline
  • airline_sentiment_gold
  • name
  • negativereason_gold
  • retweet_count
  • text
  • tweet_coord
  • tweet_created
  • tweet_location
  • user_timezone
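As a rough sketch of how such a file might be prepared for a classifier, the snippet below reads CSV data with the column names listed above and keeps only two of them. Choosing `text` as the input and `airline_sentiment` as the label is an assumption made for this example, the helper `load_samples` is hypothetical, and the two rows are made up rather than taken from the real dataset.

```python
import csv
import io

# Two made-up rows using the column names listed above; not real dataset rows.
SAMPLE_CSV = """tweet_id,airline_sentiment,airline_sentiment_confidence,negativereason,negativereason_confidence,airline,airline_sentiment_gold,name,negativereason_gold,retweet_count,text,tweet_coord,tweet_created,tweet_location,user_timezone
1,positive,1.0,,,ExampleAir,,user_a,,0,"Thanks for the smooth flight!",,,,
2,negative,0.9,Late Flight,0.8,ExampleAir,,user_b,,0,"Two hours late, again.",,,,
"""


def load_samples(csv_file):
    """Return (texts, labels) built from the text and airline_sentiment columns."""
    texts, labels = [], []
    for row in csv.DictReader(csv_file):
        texts.append(row["text"])
        labels.append(row["airline_sentiment"])
    return texts, labels


texts, labels = load_samples(io.StringIO(SAMPLE_CSV))
print(labels)
print(texts)
```

With a downloaded copy of the dataset, the same function could be given an open file handle instead of the in-memory sample.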
