Understanding How Charts Are Generated
This guide explains how the charts and tables in the Climate Indices for Pacific Sectoral Applications Portal are created. Understanding these details can help you interpret the information more effectively.
The portal allows you to select various options (like station, climate variable, dates, and chart type). Based on your choices, the system retrieves data and performs calculations to generate the visualisations you see.
General Process:
- Your Selections: You choose options in the "Data Selection" panel.
- Data Retrieval & Calculation: The portal fetches the necessary data or performs statistical calculations.
- Display: Your browser then shows the chart or table. Some final touches, like unit conversions or applying a running average, happen at this stage.
Understanding Common Data Selection Options
Many options in the "Data Selection" panel are common across different chart types. Here's what they mean:
Station(s)
This selects the weather station(s) for which data will be displayed. Some charts, like Line and Column charts, can compare multiple stations. Others, like Box Plots or Climate Summaries, work with a single station.
Time Series Period (Daily/Monthly)
This option determines the fundamental time step of the data used for some charts:
- Line and Column Charts: Directly uses either daily or monthly data values.
- Box Plot Charts: Calculates statistics from the values at the chosen time step. For example, if "Daily" is chosen as the Time Series Period and "Monthly" as the Reporting Period, a separate box plot is generated for each January in your selected range (e.g., one for Jan 2000, one for Jan 2001, etc.), each based on the daily values within that specific month.
- Seasonal Cycle Chart & Climate Summary Table: This setting is ignored for these chart types. They always use underlying daily data to calculate their monthly and annual statistics, regardless of this selection. The interface may disable this option or default it to 'Monthly' for these charts.
Data Source (Pre-Hom/Homogenised)
This selects the processing level of the data:
- Pre-homogenised (Pre-Hom): Data that has been checked for errors and inconsistencies.
- Homogenised: Data that has been statistically adjusted to account for non-climatic changes, such as station moves or instrument changes. This makes it more suitable for analysing long-term trends.
This choice affects which dataset is used for all chart types.
Variable
Selects the specific climate element to display or analyse, such as:
- Maximum Temperature (Tmax)
- Minimum Temperature (Tmin)
- Mean Temperature (Tmean)
- Diurnal Temperature Range (DTR - difference between Tmax and Tmin)
- Rainfall
- Mean Sea Level Pressure (MSLP)
- Climate Indices: A range of calculated indices representing various aspects of climate, such as temperature extremes (e.g., summer days, tropical nights), rainfall patterns (e.g., heavy rain days), and drought indicators.
This option is key for Line, Column, and Box Plot charts. It's ignored for Seasonal Cycle Charts and Climate Summary Tables, as they inherently show summaries for multiple variables (e.g., temperature and rainfall).
Reporting Period
This controls how time series data is grouped or aggregated for display or further calculation. The effect depends on the chart type.
- Monthly:
  - Line/Column: Shows individual monthly data points.
  - Box Plot: When "Time Series Period" is "Daily", this calculates a distinct box plot for each individual calendar month within each selected year (e.g., a box plot for January 2000, another for February 2000, then January 2001, etc.). Each box plot uses the daily data from that specific month. (Note: If "Time Series Period" is "Monthly", selecting "Monthly" for "Reporting Period" is not applicable for box plots.)
- Annual:
  - Line/Column: Averages monthly data to show an annual value.
  - Box Plot: Calculates one box plot for each year.
- Custom:
  - Line/Column/Box Plot: You select start and end months. Data is aggregated or statistics calculated for this custom period. If the period crosses a calendar year (e.g., September-January), it is labelled by the year the season starts in. For example, a season from September 2021 to January 2022 would be labelled as the '2021' season.
- Daily:
  - Line/Column: Shows individual daily data points. Only available if "Time Series Period" is also "Daily".
  - Box Plot: Not applicable.
Completeness rules for aggregated periods:
- For Annual and Custom periods, a data point (like an annual average or a seasonal box plot) is only included if all the months that make up that period have valid data. For instance, an annual average for 2020 is only shown if data is available for all 12 months of 2020. If any month is missing, the 2020 value is excluded.
- For Custom periods, the system also checks that the entire season falls within your selected "Year Range". If your Year Range cuts off part of a season, that seasonal point or box plot is excluded. This ensures you only see complete seasons within your selected time frame.
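The two completeness rules and the season labelling can be illustrated with a short Python sketch. This is not the portal's own code: the function name, the dictionary layout, and the use of a plain mean are assumptions made for the example.

```python
from statistics import mean

def aggregate_periods(monthly, start_month, end_month, first_year, last_year):
    """Average monthly values into one value per period (hypothetical helper).

    monthly: dict mapping (year, month) -> value, with None or no entry when missing.
    start_month, end_month: 1-12; use 1 and 12 for an "Annual" period.
    Returns a dict mapping the label year (the year the period starts in) -> mean.
    """
    crosses_year = end_month < start_month
    results = {}
    for year in range(first_year, last_year + 1):
        end_year = year + 1 if crosses_year else year
        # The whole period must fall inside the selected Year Range.
        if end_year > last_year:
            continue
        # List every (year, month) that makes up this period.
        if crosses_year:
            keys = [(year, m) for m in range(start_month, 13)]
            keys += [(end_year, m) for m in range(1, end_month + 1)]
        else:
            keys = [(year, m) for m in range(start_month, end_month + 1)]
        values = [monthly.get(k) for k in keys]
        # Every month in the period needs a valid value.
        if any(v is None for v in values):
            continue
        results[year] = mean(values)
    return results

# September 2021 to January 2022, labelled as the 2021 season.
monthly = {(2021, 9): 10.0, (2021, 10): 20.0, (2021, 11): 30.0,
           (2021, 12): 40.0, (2022, 1): 50.0}
seasons = aggregate_periods(monthly, 9, 1, 2021, 2022)

# The same season with November 2021 missing is dropped entirely.
incomplete = dict(monthly)
incomplete[(2021, 11)] = None
seasons_incomplete = aggregate_periods(incomplete, 9, 1, 2021, 2022)
```

In this example the 2022 season is never produced, because it would end in January 2023, outside the selected Year Range.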
Year Range (Start/End)
Defines the overall time window for all data and calculations.
Units (SI/Imperial)
Choose the unit system for displaying values (e.g., °C vs °F, mm vs inches). All calculations are done in SI units first, and then converted for display if you select Imperial.
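As a rough illustration of converting at display time, the standard SI-to-Imperial formulas look like this in Python. The function names are hypothetical, and the separate handling of temperature differences (such as DTR) is an assumption about good practice, not a description of the portal's code.

```python
def celsius_to_fahrenheit(value_c):
    """Convert an absolute temperature from degrees Celsius to Fahrenheit."""
    return value_c * 9 / 5 + 32

def celsius_interval_to_fahrenheit(delta_c):
    """Convert a temperature difference (e.g. a range) from Celsius to Fahrenheit.

    Assumption: differences are scaled only; the +32 offset applies to
    absolute temperatures, not to differences between two temperatures.
    """
    return delta_c * 9 / 5

def mm_to_inches(value_mm):
    """Convert a rainfall amount from millimetres to inches (1 in = 25.4 mm)."""
    return value_mm / 25.4

# Calculations stay in SI; conversion is the final display step.
annual_mean_c = 27.5
display_value = celsius_to_fahrenheit(annual_mean_c)
```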
Line / Column Chart
Purpose: These charts show trends in a single climate variable over time. They are useful for seeing patterns and changes, and can also compare trends between multiple stations. Line charts connect data points, making it easy to see the overall trend, while column charts use vertical bars, which can make it easier to compare the magnitude of individual period values.
Data Used: Both chart types display time series data points (date and value) based on your selections for Station(s), Time Series Period, Data Source, Variable, Reporting Period, and Year Range.
Running Average / Trend
This option applies a smoothing filter or a trend calculation to the time series data shown on Line and Column charts. It helps to highlight underlying patterns by reducing short-term fluctuations.
The choice of a period depends on the type of variability you want to smooth out:
- Shorter periods (e.g., 3, 5 years): Use these to smooth out minor, year-to-year fluctuations while still showing medium-term patterns and cycles.
- Longer periods (e.g., 11, 15 years): Use these to see the underlying long-term trend more clearly. Averaging over a larger window smooths out more of the variability, leaving a smoother line that highlights the overall direction of change.
The available settings are:
- '0' (None): No running average or trend is applied. You see the original data (after any "Reporting Period" aggregation).
- Numerical values (3, 5, 7, ... 15): Calculates a "centred running mean". For each point, it averages the values in a window around that point (e.g., a 5-period running mean averages the point itself, the 2 points before it, and the 2 points after it). This smooths out the line. The 'period' (e.g., 5 days, 5 months, or 5 years) depends on your "Time Series Period" and "Reporting Period" selections.
- 'T' (Trend Line): Calculates a linear trend line using a Theil-Sen estimation. This line shows the general direction of change (increasing, decreasing, or stable) over the selected period. 95% confidence intervals are shown with the trend line.
Important: The Sen's slope trend ('T') will only be calculated and displayed if the following conditions are met:
- The "Reporting Period" must be set to 'Annual' or 'Custom'.
- The selected "Year Range" must span at least 20 years.
- At least 20 years with actual (non-null) data must be available within the selected period.
- At least 80% of the years within the selected "Year Range" must have valid data.
- 'A' (Average of Series): Calculates the simple arithmetic average of all the data points in your selected series and displays it as a flat horizontal line. This helps you see how individual points vary above or below the overall average.
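The numerical and 'T' options can be sketched in Python as follows. This is an illustration under stated assumptions, not the portal's implementation: the function names are hypothetical, points without a full window are returned as None, the year span is counted inclusively, and the 95% confidence intervals are not computed here.

```python
from itertools import combinations
from statistics import mean, median

def centred_running_mean(values, window):
    """Centred running mean for an odd window size.

    Assumption: positions without a complete, non-missing window yield None.
    """
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = i - half, i + half + 1
        chunk = values[lo:hi] if lo >= 0 and hi <= len(values) else []
        if len(chunk) == window and all(v is not None for v in chunk):
            out.append(mean(chunk))
        else:
            out.append(None)
    return out

def trend_is_eligible(reporting_period, start_year, end_year, yearly):
    """Check the four conditions listed for the 'T' option.

    yearly: dict mapping year -> value, with None or no entry when missing.
    Assumption: the span counts both the start and the end year.
    """
    span = end_year - start_year + 1
    valid = sum(1 for y in range(start_year, end_year + 1)
                if yearly.get(y) is not None)
    return (reporting_period in ("Annual", "Custom")
            and span >= 20
            and valid >= 20
            and valid / span >= 0.8)

def theil_sen(points):
    """Theil-Sen line: the slope is the median of all pairwise slopes.

    points: list of (x, y) pairs. The intercept uses one common convention,
    median(y) - slope * median(x).
    """
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(points, 2)
              if x2 != x1]
    slope = median(slopes)
    intercept = median(y for _, y in points) - slope * median(x for x, _ in points)
    return slope, intercept

smoothed = centred_running_mean([1, 2, 3, 4, 5], 3)
points = [(2000 + i, 2 * i + 1) for i in range(20)]
slope, intercept = theil_sen(points)
```

Because the slope is a median rather than a least-squares fit, a few extreme years have little effect on the trend line.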
Box Plot
Purpose: Shows the statistical distribution of a variable for specific periods (e.g., each month of each year, each year, or each season). It is useful for understanding variability, because each plot shows the minimum, maximum, and quartiles of the data. Box plots are only shown for a single station.
What a Box Plot Shows:
- The Box itself represents the middle 50% of the data.
- The bottom edge of the box is the 1st Quartile (Q1) or 25th percentile: 25% of data points are below this value.
- The line inside the box is the Median (or 2nd Quartile, Q2, 50th percentile): The middle value of the dataset; 50% of data points are below this value.
- The top edge of the box is the 3rd Quartile (Q3) or 75th percentile: 75% of data points are below this value.
- The Interquartile Range (IQR) is the height of the box (Q3 - Q1).
- The Whiskers (lines extending from the top and bottom of the box) show the full range of the data. The lower whisker extends to the minimum value in the dataset for that period, and the upper whisker extends to the maximum value.
Data Used: For each selected station, statistical summaries (like median, quartiles, whisker limits) are calculated for each individual instance of the chosen "Reporting Period" (e.g., for January 2000, then February 2000, then January 2001 if "Monthly" is chosen). These calculations are based on the underlying "Time Series Period" (Daily or Monthly data) for that specific instance. Only data from seasons or periods fully contained within your selected "Year Range" are included.
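As a sketch of how one box plot per month instance could be derived from daily values, the following Python groups values by (year, month) and computes the five statistics described above. The function name and input layout are hypothetical, and the quartile method (linear interpolation, "inclusive") is an assumption; the portal's exact quartile definition is not stated here.

```python
from collections import defaultdict
from statistics import quantiles

def monthly_box_stats(daily):
    """Five-number summary for each (year, month) instance.

    daily: iterable of (year, month, value) tuples; value is None when missing.
    """
    groups = defaultdict(list)
    for year, month, value in daily:
        if value is not None:
            groups[(year, month)].append(value)
    stats = {}
    for key, values in sorted(groups.items()):
        if len(values) < 2:
            continue  # too few points to compute quartiles
        # Assumption: "inclusive" quartiles; other definitions give
        # slightly different Q1 and Q3 values for small samples.
        q1, q2, q3 = quantiles(values, n=4, method="inclusive")
        stats[key] = {
            "min": min(values),   # lower whisker
            "q1": q1,             # bottom edge of the box
            "median": q2,         # line inside the box
            "q3": q3,             # top edge of the box
            "max": max(values),   # upper whisker
        }
    return stats

daily = [(2000, 1, v) for v in [1, 2, 3, 4, 5]]
daily += [(2001, 1, v) for v in [10, 20, 30, 40, 50]]
box_stats = monthly_box_stats(daily)
```

January 2000 and January 2001 each get their own entry, matching the one-box-per-instance behaviour described above.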
Seasonal Cycle Chart
Purpose: Visualises the typical annual cycle of temperature (average high and low) and rainfall (average total) for a single station. It's based on long-term monthly averages and can be displayed in two formats:
- Polar Chart: This circular chart provides an intuitive year-round view, excellent for showing the cyclical nature of temperature and rainfall.
- Bar and Line Chart: This standard chart displays rainfall as bars and temperature as lines, which can be easier for comparing precise monthly values.
Data Used: This chart uses long-term monthly averages. These averages are always calculated from the underlying daily data for your selected "Data Source" (Pre-Hom or Homogenised) and "Year Range".
Selections for "Variable", "Time Series Period", and "Reporting Period" are ignored for this chart type, as it has a fixed way of presenting temperature and rainfall cycles.
The monthly averages for temperature and rainfall shown in this chart are based on pre-calculated monthly data. If, for a particular calendar month (e.g., January), this pre-calculated data is missing for some years within your selected period, the long-term average for that month will be based on fewer years of data. The portal will display a warning message if this occurs, listing the specific months, variables (Average High Temperature, Average Low Temperature, Average Rainfall), and the years for which the pre-calculated monthly data was missing. This helps you understand the reliability of the averages shown.
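The effect of missing years on a long-term monthly average can be sketched in Python. The function name and data layout are hypothetical; the sketch only shows how an average over fewer years and the accompanying list of missing years might be produced for one variable.

```python
from statistics import mean

def long_term_monthly_means(monthly, first_year, last_year):
    """Long-term mean per calendar month, plus the years missing for each month.

    monthly: dict mapping (year, month) -> value, with None or no entry when missing.
    Returns (averages, missing), where averages maps month -> mean (or None)
    and missing maps month -> list of years without data.
    """
    averages, missing = {}, {}
    for month in range(1, 13):
        present, absent = [], []
        for year in range(first_year, last_year + 1):
            value = monthly.get((year, month))
            if value is None:
                absent.append(year)
            else:
                present.append(value)
        # The average uses whichever years are available.
        averages[month] = mean(present) if present else None
        if absent:
            missing[month] = absent  # material for a warning message
    return averages, missing

# Three years of data, with January 2001 missing.
monthly = {(y, m): 10.0 for y in (2000, 2001, 2002) for m in range(1, 13)}
monthly[(2000, 1)] = 20.0
monthly[(2002, 1)] = 30.0
del monthly[(2001, 1)]
averages, missing = long_term_monthly_means(monthly, 2000, 2002)
```

Here the January average rests on two years instead of three, which is the situation the warning message is meant to flag.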
Climate Summary Table (Heatmap)
Purpose: Displays key monthly and annual climate statistics, such as long-term averages and record values, for a single station. These statistics are calculated across the entire period defined by your selected "Year Range". Colours (a heatmap) are used to highlight high or low values.
Understanding the Colours (Heatmap)
The colours in the table are relative to the data for the selected station and time period. This means the colours for one station are not directly comparable to another unless their data ranges happen to be the same.
- Temperature Scale: The colours transition from blue (representing the lowest temperatures in the dataset for that station) through yellow (mid-range) to red (representing the highest temperatures).
- Rainfall Scale: The colours transition from a very light green/white (for the lowest monthly rainfall values) to a dark green (for the highest monthly rainfall values). The annual total rainfall value is displayed but does not influence the colour scale.
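To show why colours are not comparable between stations, here is a minimal Python sketch of a relative blue-yellow-red temperature scale. The function name, the exact RGB values, and the linear interpolation are assumptions for illustration; the portal's actual palette may differ.

```python
def temperature_colour(value, lowest, highest):
    """Map a value to an (r, g, b) colour relative to a station's own range.

    Assumption: pure blue, yellow, and red anchors with linear blending.
    """
    if highest == lowest:
        t = 0.5  # a flat dataset sits in the middle of the scale
    else:
        t = (value - lowest) / (highest - lowest)
    blue, yellow, red = (0, 0, 255), (255, 255, 0), (255, 0, 0)
    if t <= 0.5:
        start, end, fraction = blue, yellow, t / 0.5
    else:
        start, end, fraction = yellow, red, (t - 0.5) / 0.5
    return tuple(round(a + (b - a) * fraction) for a, b in zip(start, end))

# The same 30 degree value gets different colours at two stations,
# because each scale is anchored to that station's own minimum and maximum.
colour_station_a = temperature_colour(30, lowest=20, highest=30)
colour_station_b = temperature_colour(30, lowest=25, highest=35)
```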
Statistics Shown Include:
- Mean Max Air Temp for each month and for the year.
- Mean Air Temp for each month and for the year.
- Mean Min Air Temp for each month and for the year.
- Mean Total Rainfall for each month and for the year.
Data Used: The statistics in this table are generated from pre-calculated monthly data. All average values (e.g., Mean Max Air Temp, Mean Total Rainfall) are calculated from this monthly data.
Selections for "Variable", "Time Series Period", and "Reporting Period" are ignored.
The monthly average statistics (Mean Max Air Temp, Mean Air Temp, Mean Min Air Temp, and Mean Total Rainfall) displayed in this table are derived from pre-calculated monthly data. If this pre-calculated data is missing for certain years for a given calendar month and variable, the long-term average for that month/variable will be based on fewer years of data. The portal will display a warning message if this occurs, detailing which months, variables, and years had missing pre-calculated monthly data. This helps you understand the reliability of the averages shown.
