Monday, August 2, 2010

Machine learning: logistic regression with L2 regularization in Python


Image : http://www.flickr.com


Logistic regression

Logistic regression is used for binary classification problems, where you have some examples that are "on" and other examples that are "off." You get as input a training set that has some examples of each class along with a label saying whether each example is "on" or "off." The goal is to learn a model from the training data so that you can predict the label of new examples that you have not seen before and do not know the label of.
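In code form, the heart of the model is just a weighted sum of an example's features pushed through a sigmoid to give the probability of "on." The snippet below is a minimal sketch of that idea; the weights and feature values are made up for illustration, and the names mirror the full code further down.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical learned weights and one example's feature vector (made-up numbers)
betas = np.array([0.8, -1.5, 0.3])
x = np.array([1.0, 0.2, 2.5])

# Probability that this example is "on"
p_on = sigmoid(np.dot(betas, x))
print(p_on)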

As an example: suppose you have data about buildings and earthquakes (for example, the year each building was constructed, the type of material used, the strength of the earthquake, etc.), and you know whether each building collapsed ("on") or not ("off") in each past earthquake. Using this data, you would like to make predictions about whether a given building is going to collapse in a hypothetical future earthquake.

One of the first models that would be worth trying is logistic regression.

The code

I was not working on this exact problem, but I was working on something close to it. Being one to practice what I preach, I started looking for a dead simple Python logistic regression class. The only requirement was that I wanted it to support L2 regularization (more on that later). I am also sharing this code with a bunch of other people on many platforms, so I wanted as few dependencies on external libraries as possible.

I did not find exactly what I wanted, so I decided to take a stroll down memory lane and implement it myself. I have written it in C++ and Matlab before, but never in Python.

I will not do the derivation here, but there are plenty of good explanations out there to follow if you are not afraid of a little calculus. Just do a bit of Googling for "logistic regression derivation." The big idea is to write down the probability of the data given some setting of the internal parameters, then take the derivative, which tells you how to change the internal parameters to make the data more likely. Got it? Good.
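For reference, here is a rough sketch of what that derivation gives you, using the same conventions as the code below (labels y_i in {-1, +1}, weights beta, and the sigmoid sigma):

\ell(\beta) = \sum_{i=1}^{N} \log \sigma\left(y_i\, \beta^\top x_i\right),
\qquad
\frac{\partial \ell}{\partial \beta_k} = \sum_{i=1}^{N} y_i\, x_{ik}\, \sigma\left(-y_i\, \beta^\top x_i\right),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}.

Climbing this gradient makes the training data more likely under the model, which is what the train() method below hands off to the optimizer (with signs flipped, since the optimizer minimizes).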

For those of you out there who know logistic regression inside and out, take a look at how short the train() method is. I really like how easy it is to do in Python.

Regularization

I caught a bit of indirect flak back during March Madness season for regularizing the latent vectors in my matrix factorization model of team offensive and defensive strengths when predicting the outcomes of NCAA basketball games. Apparently, people thought I was being silly. Crazy, right?

But seriously, people: regularization is a good idea.

Let me drive the point home. Check out the results of running the attached code (below).

Take a look at the top row.

On the left is the training set. There are 25 examples laid out along the x axis, and the y axis indicates whether each example is "on" (1) or "off" (0). For each of these examples, there is a vector describing its attributes that I am not showing. After training the model, I ask it to ignore the known training labels and to estimate the probability that each label is "on" based only on the examples' description vectors and what it has learned (hopefully things like stronger earthquakes and older buildings increase the probability of collapse). The probabilities are shown by the red X's. In the top left, the red X's sit right on top of the blue dots, so the model is very sure about the labels of the examples, and it is always correct.

Now, on the right side, we have some new examples that the model has never seen before. This is called the test set. It is essentially the same setup as on the left, but the model knows nothing about the test set's class labels (yellow dots). What you see is that it still does a decent job of predicting the labels, but there are some troubling cases where it is very confident and very wrong. This is known as overfitting.

This is where regularization comes in. As you move down the rows, there is stronger L2 regularization, or, equivalently, more pressure on the internal parameters to be zero. This has the effect of reducing the model's certainty. Just because it can perfectly reconstruct the training set does not mean it has everything figured out. You can imagine that if you were relying on this model to make important decisions, it would be desirable to have at least a little regularization in there.
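Concretely, the L2 penalty just subtracts a quadratic term from the log-likelihood being maximized, so the objective becomes roughly

\ell_{\text{reg}}(\beta) = \sum_{i=1}^{N} \log \sigma\left(y_i\, \beta^\top x_i\right) \;-\; \frac{\alpha}{2} \sum_{k \ge 1} \beta_k^{2},

where alpha controls the strength of the pull toward zero (in the code below, the first component of beta is left unpenalized).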

And here is the code. It looks long, but most of it is there to generate the data and plot the results. The bulk of the work is done by the train() method, which is only three (dense) lines. It requires NumPy, SciPy, and pylab.

* For full disclosure, I should admit that I generated my random data in a way that makes it susceptible to overfitting, possibly making logistic regression without regularization look worse than it is.

Python code

from scipy.optimize import fmin_bfgs
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class SyntheticClassifierData():

    def __init__(self, N, d):
        """ Create N instances of d-dimensional input vectors and a 1D
        class label (-1 or 1). """

        means = .05 * np.random.randn(2, d)

        self.X_train = np.zeros((N, d))
        self.Y_train = np.zeros(N)
        for i in range(N):
            if np.random.random() > .5:
                y = 1
            else:
                y = 0
            self.X_train[i, :] = np.random.random(d) + means[y, :]
            self.Y_train[i] = 2.0 * y - 1

        self.X_test = np.zeros((N, d))
        self.Y_test = np.zeros(N)
        for i in range(N):
            if np.random.randn() > .5:
                y = 1
            else:
                y = 0
            self.X_test[i, :] = np.random.random(d) + means[y, :]
            self.Y_test[i] = 2.0 * y - 1


class LogisticRegression():
    """ A simple logistic regression model with L2 regularization (zero-mean
    Gaussian priors on the parameters). """

    def __init__(self, x_train=None, y_train=None, x_test=None, y_test=None,
                 alpha=.1, synthetic=False):

        # Set the L2 regularization strength
        self.alpha = alpha

        # Set the data.
        self.set_data(x_train, y_train, x_test, y_test)

        # Initialize the parameters to zero, for lack of a better choice.
        self.betas = np.zeros(self.x_train.shape[1])

    def negative_lik(self, betas):
        return -1 * self.lik(betas)

    def lik(self, betas):
        """ Likelihood of the data under the current settings of the parameters. """

        # Data likelihood
        l = 0
        for i in range(self.n):
            l += np.log(sigmoid(self.y_train[i] *
                                np.dot(betas, self.x_train[i, :])))

        # Prior likelihood
        for k in range(1, self.x_train.shape[1]):
            l -= (self.alpha / 2.0) * betas[k] ** 2

        return l

    def train(self):
        """ Define the gradient and hand it off to a scipy gradient-based
        optimizer. """

        # Derivative of the negative likelihood with respect to beta_k
        # (signs flipped relative to lik() because we minimize). The first
        # component (k = 0) is left unregularized, matching lik().
        dB_k = lambda B, k: (k > 0) * self.alpha * B[k] - np.sum([
            self.y_train[i] * self.x_train[i, k] *
            sigmoid(-self.y_train[i] * np.dot(B, self.x_train[i, :]))
            for i in range(self.n)])

        # The full gradient is just an array of componentwise derivatives
        dB = lambda B: np.array([dB_k(B, k)
                                 for k in range(self.x_train.shape[1])])

        # Optimize
        self.betas = fmin_bfgs(self.negative_lik, self.betas, fprime=dB)

    def set_data(self, x_train, y_train, x_test, y_test):
        """ Take data that has already been generated. """

        self.x_train = x_train
        self.y_train = y_train
        self.x_test = x_test
        self.y_test = y_test
        self.n = y_train.shape[0]

    def training_reconstruction(self):
        p_y1 = np.zeros(self.n)
        for i in range(self.n):
            p_y1[i] = sigmoid(np.dot(self.betas, self.x_train[i, :]))

        return p_y1

    def test_predictions(self):
        p_y1 = np.zeros(self.n)
        for i in range(self.n):
            p_y1[i] = sigmoid(np.dot(self.betas, self.x_test[i, :]))

        return p_y1

    # plot(), ylim(), etc. come from the pylab import in the __main__ block below.
    def plot_training_reconstruction(self):
        plot(np.arange(self.n), .5 + .5 * self.y_train, 'bo')
        plot(np.arange(self.n), self.training_reconstruction(), 'rx')
        ylim([-.1, 1.1])

    def plot_test_predictions(self):
        plot(np.arange(self.n), .5 + .5 * self.y_test, 'yo')
        plot(np.arange(self.n), self.test_predictions(), 'rx')
        ylim([-.1, 1.1])


if __name__ == "__main__":
    from pylab import *

    # Create a 20-dimensional data set with 25 points -- this will be
    # susceptible to overfitting.
    data = SyntheticClassifierData(25, 20)

    # Run for a variety of regularization strengths
    alphas = [0, .001, .01, .1]
    for j, a in enumerate(alphas):

        # Create a new learner, but use the same data for each run
        lr = LogisticRegression(x_train=data.X_train, y_train=data.Y_train,
                                x_test=data.X_test, y_test=data.Y_test,
                                alpha=a)

        print("Initial likelihood:")
        print(lr.lik(lr.betas))

        # Train the model
        lr.train()

        # Display execution info and plot the results
        print("Final betas:")
        print(lr.betas)
        print("Final lik:")
        print(lr.lik(lr.betas))

        subplot(len(alphas), 2, 2 * j + 1)
        lr.plot_training_reconstruction()
        ylabel("Alpha=%s" % a)
        if j == 0:
            title("Training set reconstructions")

        subplot(len(alphas), 2, 2 * j + 2)
        lr.plot_test_predictions()
        if j == 0:
            title("Test set predictions")

    show()
