## Polynomials (for the non data scientist)

*"Don't worry about your difficulties in Mathematics. I can assure you mine are still greater."* Albert Einstein

**The nth degree**

Whenever you talk to a data scientist there will be a point in the conversation when the data scientist will probably say, "polynomial to the nth degree."

The definitions you hear are similar to the definition *Data* from *Star Trek: The Next Generation* would provide. Imagine Data saying quickly and without emotion, *"The degree of a polynomial is the highest degree of its terms when the polynomial is expressed in its canonical form consisting of a linear combination of monomials."* I did not make up this definition. This is the real definition provided on Wikipedia.

**Purpose**

The purpose of a "polynomial" is to solve problems. Developing a model using a polynomial (some equation) is an attempt to draw a line through actual observations. It is akin to connecting the dots to see if some pattern pops out. Today's programming languages allow complex polynomials to be developed using only a few lines of code. Just because complex polynomials to the nth degree can be derived does not mean they are effective or useful.

**Middle School Math**

Most everyone has forgotten everything they ever learned about polynomials in middle school math. Yes, you studied polynomials (and some data science) in the 8th grade. You may even have thought or said, "I will never use this in real life."

*1st degree polynomial* is just a straight line, also known as a linear equation. It is called **line**ar because it is a straight line. The rate of change is the slope of the line and is constant.

*2nd degree polynomial* is a parabola. It has only one peak or valley. You used to factor these bad boys in middle school. The rate of change, or slope, is not constant along a parabola: it varies with x and can be negative or positive.

*3rd degree polynomial* can have up to two turning points: one peak and one valley. The 3rd degree is also a euphemism for inflicting physical or mental pain. The rate of change is not constant. The slope can switch between negative and positive up to two times.
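The pattern in the three descriptions above is that a polynomial of degree n can turn at most n - 1 times. It can be checked with a few lines of Python. This is a minimal sketch, not part of the original analysis: the helper names (`derivative`, `evaluate`, `count_turning_points`) and the three example polynomials are made up for illustration. It counts a peak or valley every time the slope changes sign.

```python
def derivative(coeffs):
    """coeffs[i] is the coefficient of x**i; return the slope polynomial's coefficients."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's method."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def count_turning_points(coeffs, lo=-3.0, hi=3.0, steps=600):
    """Count sign changes of the slope on an evenly spaced grid over [lo, hi]."""
    slope = derivative(coeffs)
    turns = 0
    previous_sign = 0
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        s = evaluate(slope, x)
        sign = (s > 0) - (s < 0)
        if sign != 0:
            # A peak or valley sits wherever the slope flips sign.
            if previous_sign != 0 and sign != previous_sign:
                turns += 1
            previous_sign = sign
    return turns


# Hypothetical examples, one per degree discussed above.
line = [1, 2]            # 1 + 2x
parabola = [0, 0, 1]     # x^2
cubic = [0, -3, 0, 1]    # x^3 - 3x

for name, coeffs in [("1st degree", line), ("2nd degree", parabola), ("3rd degree", cubic)]:
    print(name, "turning points:", count_turning_points(coeffs))
```

The straight line never turns, the parabola turns once, and this particular cubic turns twice. A cubic can also turn zero times (x^3 on its own never does), which is why "up to" matters.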

**Tweaking the Math or are more degrees better?**

Why stop at three? Why not a fourth or fifth or sixth degree polynomial? Isn't a tenth degree polynomial more impressive than a 3rd degree polynomial or a pitiful 1st degree (straight line)?

Degrees in a polynomial are like giving directions to a person. The more degrees, the more turns. The more turns, the more likely a person is going to get lost. The same is true with a polynomial. With more degrees, the slope can flip between positive and negative more often, which means more ups and downs in the graph. The graph can take a sudden upward spike then come back to normal.

**Nth degree (or playing with matches)**

By adding additional degrees to a polynomial you can get silly-looking lines that curve and bend to fit each individual data point (see graphs below). The goal of hitting each data point is to minimize error and perhaps even eliminate error. This is also known as overfitting a line.

In the following two graphs, the blue dots are actual observations. In the first graph (Figure 1) the green line is a simple 2nd degree polynomial and the red dashed line is an impressive 10th degree polynomial. Both of these polynomials were "fit" to the observations. The errors of the green line are the distances between the blue dots and the green line. The error for the 10th degree polynomial is zero because it is forced through all the data points. There is no distance between the red dashed line and the blue dots. It should be evident the 10th degree polynomial is erratic and is not valid beyond the data points. It takes a nose dive after the last data point.

In the second graph the green line is a pitiful 1st degree polynomial (a simple-minded straight line) and the red dashed line is a 10th degree polynomial. Again, the error is the distance between the blue dots and the green line. It could be argued the straight line is a better model than the curvy red dashed line. This is true even though the straight line has more error than the curvy dashed line. The straight line appears to be a better predictor of the future than the 10th degree polynomial (red dashed line).

In both of these cases the polynomial of lesser degree is going to be a better predictor than the 10th degree polynomial. In the notes section below, there is an example of a 30th degree polynomial compared to a straight line.
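The comparison between a straight line and a 10th degree polynomial can be reproduced in a rough way with plain Python. This is only a sketch under stated assumptions: the eleven observations are made up (they are not the data behind the graphs), and the function names are hypothetical. The straight line is an ordinary least-squares fit. The 10th degree curve is built with Lagrange interpolation, which is one standard way to force a polynomial through every point.

```python
# Made-up observations that roughly follow a straight line (not the article's data).
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1.2, 1.9, 3.4, 3.8, 5.3, 5.7, 7.4, 7.9, 9.1, 10.2, 10.8]


def fit_line(xs, ys):
    """Least-squares straight line; returns a function x -> predicted y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_through_all_points(xs, ys):
    """Lagrange interpolation: the polynomial of degree at most len(xs) - 1
    that passes through every point (with 11 points, up to 10th degree)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def sum_squared_error(model, xs, ys):
    """Total squared distance between the model and the observations."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys))


line = fit_line(xs, ys)
wiggly = fit_through_all_points(xs, ys)

print("error of straight line:", sum_squared_error(line, xs, ys))
print("error of 10th degree:  ", sum_squared_error(wiggly, xs, ys))
# One step past the last observation (x = 11):
print("straight line at x=11: ", line(11))
print("10th degree at x=11:   ", wiggly(11))
```

With these made-up numbers, the straight line has some error on the observed points but predicts a value near 12 at x = 11, in line with the trend. The polynomial forced through every point has zero error on the observations and then shoots off into the hundreds just one step past the last data point.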

**The goal is not to eliminate all error, but to build models for understanding and predicting**

**Weaknesses**

One of the biggest weaknesses is that the models break down outside of the data range under investigation. This is true of any polynomial regardless of the number of degrees. Keep in mind, the goal is not to eliminate all error, but to build models for understanding and predicting.

I created this beautiful model (see graph below) a few weeks ago of home runs. The point of the model is to help understand steroid usage in Major League Baseball (MLB). A problem with my model, and with any polynomial to the nth degree, is that it does not work well beyond the data in question. In the graph below, I have extended the model beyond the year 2015. The headline of my analysis could be "2067: the year no home runs are hit in Major League Baseball." We all know this is nonsense. The model is useful in helping understand steroid usage in baseball for the period of time where there is data.

**Too much of a good thing**

To paraphrase Ockham's razor, *the simplest models tend to be the best models.* Models should be built with the fewest degrees possible. The more degrees, the more complex and the more difficult it is going to be for business units to understand and use any model. Business units, sales teams, or advertisers don't care about the model. What they care about is how the model can help them improve the business or solve a problem.

Models should be built to solve problems. Just because complex polynomials to the nth degree can be built does not mean they are useful. There is a tendency to falsely conclude that the more complex a model, the better the model. Nothing could be further from the truth. To paraphrase *Goldilocks and the Three Bears*, some models are too complex, some models are too simple, and some models are just right.

**Notes:**

There is a *Star Trek: The Next Generation* episode called "The Nth Degree."

I thought about using this Star Trek quote instead of the Borg Queen quote. Spock about Khan: *"He is intelligent, but not experienced. His pattern indicates two-dimensional thinking."*

Below is a 30th degree polynomial with zero error (red dashed line). On the right side of the graph, the 30th degree polynomial becomes very erratic: it jumps up, then takes a nose dive. Obviously the 30th degree polynomial is not going to be a very good predictor even though it has zero error.