# Data Analysis by Finding Patterns and by Graphing

###### I.O.U. — This page is under construction.  The first part (until the next IOU) is fairly coherent, but certainly isn't in final form, and the ending needs to be developed and revised.  This isn't my highest priority, so it probably won't happen until Fall 2013.

A simple instructional activity illustrates basic principles of data analysis:  Imagine that we have data from a set of related physical experiments, with two variables (x, y) that are mathematically related by a theory;  all experimental systems are identical, except that we control the value of x so it's different in each system, then we observe the value of y.  Here is one possible data set:

In experiment #1, x = 3 and y = 9.  Ask students to generate three theories, with each proposing a simple mathematical relationship between x and y;  of course, they also can propose additional theories that are more complex.

For experiment #2, with x = 5, ask them to make a prediction (when x = 5, y = ?) using each theory.  Then tell them that, in a physical experiment with the second system, the real observation was "x = 5, y = 15".  For each of their original theories, ask them to compare their theory-based predictions with reality-based observations, in a Reality Check that helps them evaluate each theory.  Then all of you can discuss their experiences, and develop the concepts of plausible competitive theories (after experiment #1) before a crucial experiment (#2) lets them distinguish between these theories, so only one theory retains a high evaluative status.
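The Reality Check can be sketched in a few lines of Python.  This is only an illustration:  the three theories used here (y = 3x, y = x^2, y = x + 6) are assumed examples, chosen because each one fits experiment #1, and students might propose different theories.

```python
# Three hypothetical theories that all fit experiment #1 (x = 3, y = 9).
# These particular formulas are assumptions made for illustration.
theories = {
    "y = 3x": lambda x: 3 * x,
    "y = x^2": lambda x: x ** 2,
    "y = x + 6": lambda x: x + 6,
}


def reality_check(theories, x, observed_y):
    """Compare each theory's prediction with one observation.

    Returns a dict mapping theory name -> (prediction, matches_observation).
    """
    results = {}
    for name, predict in theories.items():
        prediction = predict(x)
        results[name] = (prediction, prediction == observed_y)
    return results


# After experiment #1, every theory is still plausible.
exp1 = reality_check(theories, x=3, observed_y=9)

# Experiment #2 (x = 5, observed y = 15) distinguishes between them.
exp2 = reality_check(theories, x=5, observed_y=15)

for name, (prediction, matches) in exp2.items():
    verdict = "agrees" if matches else "disagrees"
    print(f"{name}: predicts y = {prediction}, which {verdict} with the observation")
```

With these assumed theories, all three agree with experiment #1, but only y = 3x agrees with experiment #2.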

This simple two-experiment activity can help students understand some basic principles of Science Process, and some of its terms (italicized), in the context of their problem-solving experiences.

The activity could be extended by letting students generate a set of data for each of their theories, for experiments with x = 3, 5, 7, 9, 11, and put each data set into a table.  Then they can graph each set of data, either manually with pencil-and-paper or (after they've made manual graphs) with a computer program, so they can see what each theory “looks like” when it's represented in graphical form.
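As a rough sketch of this extension, the tables could be generated as shown below.  The x-values come from the activity, but the three formulas are again assumed examples, not the theories students must use.

```python
# x-values named in the activity.
x_values = [3, 5, 7, 9, 11]

# Hypothetical theories (assumptions for illustration only).
theories = {
    "y = 3x": lambda x: 3 * x,
    "y = x^2": lambda x: x ** 2,
    "y = x + 6": lambda x: x + 6,
}


def make_table(predict, xs):
    """Return a list of (x, y) rows, with y predicted by one theory."""
    return [(x, predict(x)) for x in xs]


# One table of predicted data per theory.
tables = {name: make_table(predict, x_values) for name, predict in theories.items()}

for name, rows in tables.items():
    print(name)
    print("    x     y")
    for x, y in rows:
        print(f"{x:5d} {y:5d}")
    print()
```

If matplotlib happens to be installed, each table could then be graphed with a call such as `plt.plot(xs, ys, "o-")`, but pencil-and-paper graphs of these tables work equally well.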

By comparing their two approaches to analyzing data — first when they mentally searched for patterns in experiment #1, and later when they graphed each of their data sets — students can recognize that the two approaches are related strategies for achieving similar results when they are designing a theory by generating-and-evaluating theories.

The graphs they make can help answer questions about the mathematical form of a theory:  are x and y related directly, as in (y = mx) or (y = mx + b)?  or is a "squaring" involved, as in (y = x^2)?   Other graphs can help answer other questions:

In addition to these graphs, they can use graphs to test a wider range of theories about mathematical relationships between x and y: (described below at *)

###### I.O.U. — Later, the "scraps" below might be "written up coherently" and used in this page;  and the entire page will be revised.

possible title for a section — Graphical Representations of Theory-Based Models

In the activity above, each graph-point is an experiment, so interpreting a graph is similar (in some ways) to interpreting a series of experiments, as examined in Section 5 of my details-page.

data sets in which variables have the relationships used in this activity (direct and squared) naturally arise from lab-experiments with physics of motion, so these could be used to make a connection with real-life phenomena;

in a computerized version of this activity (not available now) the program could show an "interesting animation" to represent the process of running a new physical experiment with a new experimental system (in which x has been changed from 3 to 5 to...) to make it more interesting for students using the computer program;

when students make tables, this can help them with pattern-finding, because in a table all numbers are visible at the same time;

* possible types of graphs:  direct (y = mx, y = mx + b), inverse (xy = k);  plus logarithmic or exponential (y = log x, y = ln x) (y = 10^x, y = e^x) after students understand logs, as explained here.

teachers can show students how to make graphs (by hand or with a computer) so they get a straight line;  students can do this manually (once or twice) when a teacher provides a data set that illustrates log-relationships, etc., with students doing the math on a calculator and putting the results into a table;  then, once they understand what a computer program is doing, they can just use a computer.
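one way to read this is that students transform the data before graphing it;  the sketch below assumes a made-up data set that follows y = 2 * 10^x (this data, and the helper names `slopes` and `is_straight`, are hypothetical), and checks that a graph of log10(y) versus x is straight even though the graph of y versus x is not.

```python
import math


def slopes(points):
    """Slope between each pair of consecutive (x, y) points."""
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]


def is_straight(points, tol=1e-9):
    """True if every consecutive slope is the same, within a tolerance."""
    s = slopes(points)
    return max(s) - min(s) <= tol


# Made-up data following y = 2 * 10^x (an assumption for illustration).
xs = [0, 1, 2, 3]
raw = [(x, 2 * 10 ** x) for x in xs]

# The "calculator step": replace each y with log10(y).
transformed = [(x, math.log10(y)) for x, y in raw]

print("raw data straight?        ", is_straight(raw))
print("transformed data straight?", is_straight(transformed))
print("slopes after transforming:", slopes(transformed))
```

this is the same arithmetic students would do by hand with a calculator and a table, before letting a computer program do it for them.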

but it might be better for students if they can recognize "the shapes" for different math-relationships, and then use a computer program to distinguish between relationships that produce similar shapes;  the shape can be linear (y = mx, y = mx + b);  curving upward with increasing slope (for y = x^2, y = e^x);  curving upward with decreasing slope (y = square root of x, y = √x, plus some log-functions);  or a different-looking shape (for xy = k);  and then they can use Excel as a "tiebreaker" for similar-looking shapes (like y = x^2, y = e^x, y = 10^x);  I.O.U. - eventually I can include graphs that "show these shapes" in this section, to supplement these descriptions using words and equations.
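until graphs are included, the shapes can be roughly illustrated with numbers;  the sketch below labels a data set by how its slope changes from left to right;  the function name `classify_shape`, the sample x-values, and the particular equations are all assumptions made for illustration.

```python
import math


def classify_shape(points, tol=1e-9):
    """Describe a graph by how its slope changes from left to right."""
    s = [(y2 - y1) / (x2 - x1)
         for (x1, y1), (x2, y2) in zip(points, points[1:])]
    changes = [b - a for a, b in zip(s, s[1:])]
    if all(abs(c) <= tol for c in changes):
        return "linear"
    if all(v > 0 for v in s) and all(c > tol for c in changes):
        return "curving upward with increasing slope"
    if all(v > 0 for v in s) and all(c < -tol for c in changes):
        return "curving upward with decreasing slope"
    return "other"


# Hypothetical sample relationships, evaluated at a few x-values.
xs = [1, 2, 3, 4, 5]
relationships = {
    "y = 2x + 1": lambda x: 2 * x + 1,
    "y = x^2": lambda x: x ** 2,
    "y = e^x": math.exp,
    "y = sqrt(x)": math.sqrt,
    "xy = 12": lambda x: 12 / x,
}

labels = {name: classify_shape([(x, f(x)) for x in xs])
          for name, f in relationships.items()}

for name, label in labels.items():
    print(f"{name}: {label}")
```

in this sketch y = x^2 and y = e^x receive the same label, which is the situation where a computer "tiebreaker" is needed.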

maybe students can use log paper? log-log paper?  probably not for most students, who should just (except for advanced students?) use computerized graphing for log-relationships?

an interesting type of experiment (or series of experiments) uses calibration curves;  this is described here where Paragraph 1e describes two useful skills (over-and-down on a graph, and "y = mx + b" by substituting-and-solving), with students doing both and comparing the results;  they should also be able to calculate a slope "m" manually from a graph (which they could draw, or print from a computer-generated graph) by choosing two points (initial & final), finding values for x and y (both initial & final), substituting numbers into the slope-equation, m = (Yfinal - Yinitial) / (Xfinal - Xinitial), and comparing their value with the value of "m" generated in the computer's equation for "y = mx + b".
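the slope calculation and the substituting-and-solving step can be sketched as follows;  the two calibration points and the measured y-value are made-up numbers, and the sketch assumes the goal is to find an unknown x from a measured y.

```python
def slope(initial, final):
    """m = (Yfinal - Yinitial) / (Xfinal - Xinitial)."""
    (x_i, y_i), (x_f, y_f) = initial, final
    return (y_f - y_i) / (x_f - x_i)


def intercept(m, point):
    """Solve y = m*x + b for b, using one point on the line."""
    x, y = point
    return y - m * x


def x_from_y(y, m, b):
    """Substituting-and-solving: solve y = m*x + b for x."""
    return (y - b) / m


# Hypothetical points read from a calibration graph.
initial = (1.0, 5.0)
final = (4.0, 11.0)

m = slope(initial, final)
b = intercept(m, initial)

# A made-up measurement (y = 9.0) for a sample with unknown x.
unknown_x = x_from_y(9.0, m, b)

print(f"m = {m}, b = {b}, x for y = 9.0 is {unknown_x}")
```

students can compare the x found this way with the x they read "over-and-down" from the graph, and compare their manual m with the computer's value.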