METR 136: Statistics (San Jose State University)
Research and Class Projects
Project 1: Temperatures and Decadal-Averaged Temperatures for Berkeley, CA
Above is a plot of temperature data collected in Berkeley, California over more than 100 years. The figure has many peaks and drops, but the highest temperatures tend to occur near the end of the period, and the decadal averages (the red dots) show a gradual increase over time.
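One way decadal averages like the red dots could be computed is sketched below. This is only an illustration: the function name `decadal_means` and the sample values are made up, and it assumes one temperature value per year.

```python
from collections import defaultdict

def decadal_means(yearly_temps):
    """Average yearly temperatures by decade, keyed by the decade's first year."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, temp in yearly_temps.items():
        decade = (year // 10) * 10  # e.g. 1987 -> 1980
        sums[decade] += temp
        counts[decade] += 1
    return {d: sums[d] / counts[d] for d in sorted(sums)}

# Made-up values for illustration only, not the Berkeley record
sample = {1900: 56.0, 1905: 57.0, 1911: 58.0, 1919: 60.0}
print(decadal_means(sample))  # {1900: 56.5, 1910: 59.0}
```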
Project 2: Oakland and San Francisco Histograms
These two plots are histograms of temperature, dew point, and precipitation recorded in Oakland and San Francisco between 1990 and 2015. No data were collected in Oakland for part of this period, hence the blank spot around 2000. While the two cities are quite similar in temperature and dew point, San Francisco received more precipitation overall, though its amounts are still on the low side (never higher than 5 inches).
Project 3: Annual Maximum and Minimum Temperatures for Multiple California Cities
The first two figures show annual maximum and minimum temperatures for the six listed California cities. In the third graph, the topmost data point for each station represents its average maximum temperature, while the lowest data point is its average minimum temperature. Geographical terrain is the most likely cause of the variation between maximum and minimum averages. For instance, Lake Spaulding has the least variation because the water causes the air to heat much more slowly, keeping temperatures cooler and not much above 60 °F. Fresno, by contrast, usually has little moisture, making it more susceptible to high temperatures. The same pattern appears at Eureka and Tustin Ranch, since both are near water, but there may be more variation between high and low temperatures on days when wind flows from inland and brings warmer temperatures.
This is supported by the standard deviations (see below), the largest of which is for the Tustin Ranch minimum average. Fresno also has a large deviation in its minimum average. Since Fresno is located in the Central Valley, it is the least likely to be affected by winds. Therefore, this variation in temperature is probably due to clouds.
Computed Standard Deviations:
Berkeley max: 1.3371
Berkeley min: 1.1558
Eureka max: 1.4832
Eureka min: 1.0203
Fresno max: 1.0713
Fresno min: 1.7387
Indio max: 1.2932
Indio min: 1.3033
Lake Spaulding max: 1.5312
Lake Spaulding min: 1.3033
Tustin Ranch max: 1.3033
Tustin Ranch min: 1.8156
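Standard deviations like those listed above could be computed along the following lines. This is a sketch under assumptions: the project does not say whether the sample (n - 1) or population (n) form was used, so the sample form is shown here, and the function name `spread` and the input values are hypothetical.

```python
import statistics

def spread(annual_values):
    """Sample standard deviation (n - 1 denominator) of a station's annual values."""
    return statistics.stdev(annual_values)

# Hypothetical annual maxima in degrees F, not taken from the project data
example_max = [71.0, 73.0, 72.0, 74.0, 70.0]
print(round(spread(example_max), 4))  # 1.5811
```

If the population form is wanted instead, `statistics.pstdev` can be swapped in; it gives a slightly smaller value for the same data.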
Project 4: San Jose, CA Temperatures and Precipitation: Empirical Probability
Here is the resulting plot of the calculated empirical probabilities for precipitation and maximum temperatures, and the conditional probability of both occurring at the same time. For precipitation, the probability of rain is clearly much higher at the beginning of the period. This can be attributed to the fact that the probability decreases as more years, and hence more events, are added. For the same reason, later values are closer to each other than those at the beginning, which is why the spike in 1975 is so evident. A maximum temperature over 60 degrees Fahrenheit occurred fewer times in the data, so its probability was smaller (which makes sense, since the data are from January). That is why the probability of both a maximum temperature over 60 degrees and precipitation occurring at the same time is near zero: just getting temperatures over 60 degrees was rare enough.
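The calculation described above can be sketched roughly as follows. It assumes the empirical probability is a running relative frequency, recomputed as each year is added, which would explain why early values swing more than later ones. The function names and the sample flags are hypothetical, and both a joint and a conditional estimate are shown because the description could refer to either.

```python
def running_probability(events):
    """Fraction of years in which the event occurred, after each added year."""
    probs = []
    hits = 0
    for n, occurred in enumerate(events, start=1):
        hits += bool(occurred)
        probs.append(hits / n)
    return probs

def joint_probability(a, b):
    """Estimate P(A and B) as the fraction of years in which both occurred."""
    pairs = list(zip(a, b))
    return sum(1 for x, y in pairs if x and y) / len(pairs)

def conditional_probability(a, b):
    """Estimate P(A | B) as count(A and B) / count(B); None if B never occurred."""
    b_count = sum(1 for y in b if y)
    if b_count == 0:
        return None
    return sum(1 for x, y in zip(a, b) if x and y) / b_count

# Made-up yearly flags for illustration only, not the San Jose data
rain = [True, True, False, False]
warm = [False, True, False, True]   # maximum temperature over 60 degrees F
print(running_probability(rain))            # starts at 1.0 and ends at 0.5
print(joint_probability(rain, warm))        # 0.25
print(conditional_probability(rain, warm))  # 0.5
```

With only a few years in the record, one or two rainy years push the running estimate to 1.0; each additional year then moves it by a smaller amount.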