Unveiling Speed: An Analysis of ORC Certified Sailboats

3rd October 2023 | T. Großkopf

The Goal

Sailboats, with their elegant designs and effortless speeds, have always fascinated enthusiasts. But what truly determines a boat's speed? In this article, we'll delve into the world of ORC-certified sailboats to understand which key parameters influence their speed.
We use this information to improve the boat speed estimates in our sailboats section and our calculator tool.

The Offshore Racing Congress (ORC)

The Offshore Racing Congress (ORC, orc.org) is the world's leading provider of boat ratings for competition handicapping. At the heart of the rating system is an elaborate speed prediction, based on detailed measurements of the boat. The collected data are entered into a velocity prediction programme (VPP), which then calculates boat speeds at different wind angles and wind speeds.
Velocity made good (VMG) is the speed at which a sailboat is approaching its destination. It is really helpful to know the values of VMG under various wind conditions, in order to estimate the potential of a sailboat. It can also be used to estimate the distance covered over time for race and route planning.
Beat VMG is the velocity going upwind. Since a sailboat cannot directly sail into the wind, it has to tack in order to keep approaching its target. The angle a boat can sail against the wind, as well as the speed it can maintain at that angle, will determine the beat VMG.
Run VMG, on the other hand, is the best speed a boat can make good toward a target directly downwind. Most boats sail faster at an angle to the wind, say 120° or 150°, than dead downwind at 180°, so the best run VMG is often reached by gybing between such angles rather than by pointing straight at the target.
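The geometry behind these definitions can be sketched in a few lines: VMG is the component of boat speed along the wind axis, i.e. boat speed multiplied by the cosine of the true wind angle. The polar values below are invented for illustration and are not taken from any ORC certificate; the function names are hypothetical as well.

```python
import math

def vmg(boat_speed_kn: float, twa_deg: float) -> float:
    """Speed component along the wind axis (positive = upwind, negative = downwind)."""
    return boat_speed_kn * math.cos(math.radians(twa_deg))

# Illustrative polar for a single wind speed:
# true wind angle (degrees) -> boat speed (knots). Invented numbers.
polar = {40: 4.0, 45: 4.6, 52: 5.0, 60: 5.3, 90: 5.8,
         120: 5.6, 150: 5.0, 165: 4.4, 180: 4.0}

def best_beat(polar):
    """Return (angle, VMG) of the upwind point with the largest VMG."""
    upwind = ((a, vmg(s, a)) for a, s in polar.items() if a < 90)
    return max(upwind, key=lambda t: t[1])

def best_run(polar):
    """Return (angle, downwind VMG) of the point with the most progress downwind."""
    downwind = ((a, -vmg(s, a)) for a, s in polar.items() if a > 90)
    return max(downwind, key=lambda t: t[1])

beat_angle, beat_vmg = best_beat(polar)
run_angle, run_vmg = best_run(polar)
print(f"Best beat: {beat_angle} deg, VMG {beat_vmg:.2f} kn")
print(f"Best run:  {run_angle} deg, VMG {run_vmg:.2f} kn")
```

With these made-up numbers, the best downwind progress is found at 150° rather than at 180°, which is the effect described above.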

The ORC measurements homepage. For a precise VPP prediction, a variety of measurements must be performed to generate a detailed model representation of the boat.

Collecting Data

A boat that is measured by the ORC receives an individual certificate with specific ratings for all tested conditions. For us, the interesting thing is that all active certificates are openly published on the ORC website under active certificates.
Even more conveniently, Jan Pieter Waagmeester collects all current ORC certificates in a fantastic GitHub repository, github.com/jieter/orc-data. The repository is under an MIT licence, so any use of the data for advanced analysis and machine learning is permitted.
Looking at the dataset, it contains 13,305 boats with basic hull measurements, sail areas, and outputs from the velocity prediction programme (VPP). The number of unique boat types is 4,911. It should be noted that due to spelling errors or differences in nomenclature, the number of actual unique boats will be smaller. For our type of processing, this isn't a concern, so we won't delve deeper here. To get a feel for the data, we can aggregate the boats by boat builder. Displayed here are the top 20 builders with the most boats represented in the dataset.
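A minimal sketch of this kind of aggregation, using only the Python standard library. The records and the field names `builder` and `type` are assumptions made for the example and may not match the schema of the real dataset; the normalisation step only illustrates how spelling variants inflate the count of unique boat types.

```python
from collections import Counter

# Hypothetical records; field names and values are invented for illustration.
boats = [
    {"type": "Cruiser 40", "builder": "Yard A"},
    {"type": "CRUISER 40", "builder": "yard a "},
    {"type": "Racer 35", "builder": "Yard B"},
    {"type": "Sport 33", "builder": "Yard C"},
    {"type": "Cruiser 36", "builder": "Yard A"},
]

def normalise(name: str) -> str:
    """Reduce spelling variants by collapsing whitespace and lower-casing."""
    return " ".join(name.split()).lower()

# Count boats per builder and collect the distinct boat types.
builder_counts = Counter(normalise(b["builder"]) for b in boats)
unique_types = {normalise(b["type"]) for b in boats}

top_builders = builder_counts.most_common(20)
print(top_builders)
print(len(unique_types), "unique boat types after normalisation")
```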

Top 20 sailboat constructors represented in the ORC dataset.

Dataset Basic Analysis

The boat length overall (including appendages) of sailboats in the dataset ranges from 5 to 40.6 m, with a mean value of 10.9 m.

Histogram of 13,305 sailboats aggregated by boat length overall in meters.

The sails are the part of a sailboat most responsible for its performance. The dataset contains mainsail, genoa (jib), spinnaker, and asymmetric-spinnaker sail areas. Even the same boat model can have different sail areas. This variability is visualised in the next figure, where the 20 most common sailboats are displayed in a boxplot. The colour shows the average beat VMG for each boat type at a wind speed of 6 knots (2 Beaufort).

Boxplot of the 20 most frequently represented boat types and their sail areas. The boxes show the median sail area, the 25th and 75th percentiles, the range, and the outliers. The left figure displays the mainsail area, while the right one shows the foresail (genoa/jib) area.

If you'd like to further explore the dataset, there's a really engaging interactive inspection tool developed by Jan Pieter, allowing you to display polar plots for all ORC-certified boats.

Performance

Now, let's see which variable is most strongly correlated with boat speed. To simplify our question, we'll focus on beat VMG at 6 knots of wind (light wind conditions). Intuitively, longer boats should be able to sail faster, since a boat's waterline length determines its hull speed: the longer the waterline, the higher the hull speed.
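The link between length and hull speed is usually expressed with the rule of thumb for displacement hulls: hull speed in knots is about 1.34 times the square root of the waterline length in feet. The sketch below applies that rule. Note that it takes waterline length, which is shorter than the length overall used in this article's figures, and that the function name is hypothetical.

```python
import math

FEET_PER_METRE = 3.28084

def hull_speed_knots(lwl_m: float) -> float:
    """Rule-of-thumb hull speed of a displacement hull: 1.34 * sqrt(LWL in feet)."""
    return 1.34 * math.sqrt(lwl_m * FEET_PER_METRE)

# Hull speed grows with the square root of waterline length,
# so doubling the length raises it by a factor of about 1.41.
for lwl in (6.0, 9.0, 12.0):
    print(f"LWL {lwl:4.1f} m -> about {hull_speed_knots(lwl):.1f} kn")
```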

Scatterplot of 13,305 sailboats showing their length overall in meters and the upwind speed (beat velocity made good, in knots) the boats can achieve at 6 knots wind speed.

As the scatterplot above shows, there is a correlation between boat length and beat VMG in light winds. Other boat features, such as sail area, may also influence this. But before we examine each potential boat feature in the dataset, we'll fit a multiple linear regression model to the data and identify the most significant features as ranked by the model. Using scikit-learn, a machine learning library for Python, we can develop a linear model that reasonably predicts boat speed based on its features. The model identifies the most crucial features as draft, mainsail area, and length.
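A multiple linear regression of this kind can be sketched with scikit-learn as follows. The data are synthetic stand-ins, the coefficients of the invented relationship are arbitrary, and ranking features by standardised coefficient size is only one possible way to judge importance; the article does not state which method was used.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data; the real analysis uses the ORC dataset.
length = rng.uniform(6.0, 20.0, n)                   # m
draft = 0.15 * length + rng.normal(0.0, 0.2, n)      # m, correlated with length
main_area = 3.0 * length + rng.normal(0.0, 5.0, n)   # m^2
X = np.column_stack([length, draft, main_area])
feature_names = ["length", "draft", "mainsail_area"]

# Invented linear relationship plus noise, only to make the example runnable.
beat_vmg = (1.5 + 0.08 * length + 0.40 * draft + 0.01 * main_area
            + rng.normal(0.0, 0.1, n))

# Standardise features so coefficient magnitudes are comparable across units.
X_std = StandardScaler().fit_transform(X)
model = LinearRegression().fit(X_std, beat_vmg)

# Rank features by the absolute size of their standardised coefficient.
ranking = sorted(zip(feature_names, model.coef_),
                 key=lambda t: abs(t[1]), reverse=True)
r2 = model.score(X_std, beat_vmg)

for name, coef in ranking:
    print(f"{name:15s} {coef:+.3f}")
print(f"R^2 on the training data: {r2:.3f}")
```

Because length, draft, and sail area are strongly correlated with each other, such coefficient rankings can shift noticeably between samples and should be read with some caution.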

Scatterplot of 13,305 sailboats showing their draft in meters and the upwind speed (beat velocity made good, in knots) they can achieve at 6 knots wind speed.

To me, the relatively strong correlation between draft and beat VMG was surprising. I would have thought that mainsail area and boat length are more significant. However, upon further reflection, this may be partly correlation rather than causation. A deep draft is a drawback in practice because it restricts access to many harbors, marinas, and beautiful bays. Designers therefore treat draft as a significant trade-off and specify a deep keel only when performance demands it, so a deep draft tends to mark a boat that was designed for speed in the first place.

Advanced Models

Even when combining these three features into a prediction, there's room for improvement. For our final model, we'll incorporate all the features available in the dataset to form a robust predictive model. To obtain more data (more is always better), we'll also use the other available wind speeds from the VPP data, namely 6, 8, 10, 12, 14, 16, and 20 knots. For this, we'll employ a neural network, a type of model capable of representing complex, nonlinear relationships in data, which is exactly what we need. After training the model using data from approximately 11,000 boats, we validate it with the remaining 2,000 boats. The scatterplot below demonstrates a strong correlation (R² = 0.98) between the model's predictions and the boat speeds reported by the VPP.
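The article does not say which neural network library or architecture was used, so the following is only a sketch under assumptions: scikit-learn's `MLPRegressor`, two hidden layers of 32 units, and synthetic data in place of the ORC records. It shows how wind speed can be added as an input feature (one row per boat and wind speed) and how a hold-out set yields an R² value. Splitting by boat, so that all wind speeds of one boat stay in the same subset, is a design choice of this sketch.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_boats = 300
wind_speeds = np.array([6, 8, 10, 12, 14, 16, 20], dtype=float)  # knots

# Synthetic boats (length and draft in m, sail area in m^2); not ORC data.
length = rng.uniform(6.0, 20.0, n_boats)
draft = 0.15 * length + rng.normal(0.0, 0.2, n_boats)
sail_area = 6.0 * length + rng.normal(0.0, 8.0, n_boats)

# One row per (boat, wind speed): wind speed becomes an extra input feature.
boat_idx = np.repeat(np.arange(n_boats), len(wind_speeds))
wind = np.tile(wind_speeds, n_boats)
X = np.column_stack([length[boat_idx], draft[boat_idx], sail_area[boat_idx], wind])

# Invented nonlinear target standing in for the VPP's beat VMG.
y = (1.2 * np.sqrt(length[boat_idx]) + 0.3 * draft[boat_idx]) * (1.0 - np.exp(-wind / 6.0))
y += rng.normal(0.0, 0.05, y.size)

# Hold out 45 boats; every wind speed of a boat lands in the same subset.
train_boats, test_boats = train_test_split(np.arange(n_boats), test_size=45, random_state=0)
train_mask = np.isin(boat_idx, train_boats)

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
)
model.fit(X[train_mask], y[train_mask])

r2 = model.score(X[~train_mask], y[~train_mask])
print(f"R^2 on held-out boats: {r2:.3f}")
```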

Predicted vs. VPP-reported beat VMG with a linear regression fit: neural network model predicting the beat VMG performance of ~2,000 test boats at wind speeds ranging from 6 to 20 knots.

Conclusion

In conclusion, we successfully employed 13,305 public records of ORC sailboat speed data to develop a predictive model capable of estimating beat VMG for wind speeds ranging from 6 to 20 knots, based on a handful of input parameters, such as boat length, beam, draft, sail areas, and more. Excited to see how these factors play out for your sailboat? Try our calculator tool now and chart your sailboat's potential speeds!