Changelog

Stay up to date with the latest features and improvements of the Mabyduck subjective testing platform.

May 2026

May 12, 2026

Allowlist for endpoint restriction

0.9.13

Project and dataset secrets now support an allowlist for endpoint restriction.

May 4, 2026

Performance, video experiments, config files & development mode

0.9.12

This release comes with a variety of improvements.

  • We've implemented connection pooling to support a large number of simultaneous experiments.
  • We've built an optional scrubber for video experiments.
  • The Workflows API now supports config files.
  • The read-only mode of sessions now has a floating modal providing meta information.
  • We've built support for CSV files to specify externally hosted datasets.
  • We've implemented a new Development mode for embedded experiments.

April 2026

Apr 27, 2026

Okta SSO & sequential video player

0.9.11

Organisations can now use Okta OIDC for authentication.

Pairwise video experiments also now have a sequential video player as an option.

Apr 21, 2026

Session health scores, AI raters & pairwise video experiments

0.9.10

This release brings new session and rater health scores.

We've also improved the robustness of both AI raters and pairwise video experiments.

Apr 15, 2026

Secrets, plot filters & embedded experiments

0.9.9

You can now set up project-level secrets.

We've also made it possible to filter plots by parameters, and embedded experiments can now store custom metadata on slates.

Apr 8, 2026

Survey and API improvements

0.9.8

Audio surveys now support a new "highlight transcript" question type, allowing raters to highlight words in a transcript. Survey experiments also gain a new "Percent chosen" metric for radio buttons and checkboxes, with new plots of these metrics on the results pages.
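As a rough illustration of what a "Percent chosen" metric could look like, the sketch below computes, for each option, the share of responses that selected it. The function name `percent_chosen` and the input layout (one list of selected options per response) are assumptions made for this example, not Mabyduck's implementation or API.

```python
from collections import Counter


def percent_chosen(responses, options):
    """Return the percentage of responses that selected each option.

    `responses` holds one list of selected options per response. A radio
    button yields a single-element list; a checkbox question may yield
    several options, or none.
    """
    if not responses:
        return {option: 0.0 for option in options}
    counts = Counter()
    for selected in responses:
        # Count each option at most once per response.
        counts.update(set(selected))
    return {option: 100.0 * counts[option] / len(responses) for option in options}


# Hypothetical checkbox answers from four raters.
answers = [["noise"], ["noise", "echo"], [], ["echo"]]
print(percent_chosen(answers, ["noise", "echo", "clipping"]))
```

Dividing by the number of responses (rather than the number of selections) keeps checkbox percentages comparable to radio button percentages, at the cost of checkbox percentages not summing to 100.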

Custom strategies can now also be fetched via the API.

Apr 4, 2026

Sampling and billing improvements

0.9.7

Active sampling strategies have been improved, with fixes to the initial phase of sampling unobserved conditions.

We've also enabled Stripe payments for organisation billing.

March 2026

Mar 31, 2026

Search, surveys, billing and video controls

0.9.6

This release comes with a variety of improvements.

  • You can now search by rater ID in the session browser.
  • Survey experiments can now be configured via JSON config files in the dataset.
  • We've also added billing management for enterprises, and Portuguese and Italian translations.
  • We've fixed video controls on introduction pages for experiments with hero videos.

Mar 26, 2026

Job controls, rater performance leaderboard and API improvements

0.9.5

Jobs now have a new "pending" status, and will automatically abort when an experiment is deleted. We've also added automatic checks for fragmentation of externally hosted mp4s, and disabled the ability to force-launch jobs via the API.

This release also brings rater performance leaderboards, and a handful of other API improvements.

Mar 19, 2026

API keys update

0.9.4

We've updated API keys to be associated with users.

Mar 13, 2026

Webhooks & embedded experiment improvements

0.9.3

You can now set up project webhooks for dataset status changes, job completion, and session completion, with webhook and secret management available in both the API and admin.

We've also improved the reliability of embedded experiments, including clearer API error reporting, wider time estimate support, automatic retrying of failed media parsing, and filtering of config files out of uniform strategy stimuli.

Mar 6, 2026

Full-screen & configurable hero videos

0.9.2

Hero videos can now be set to take over the full screen. They are also now configurable directly via the UI.

Mar 6, 2026

Checkboxes & Elo for MUSHRA experiments

0.9.1

Pairwise video experiments can now include arbitrary checkboxes.

We've also added Elo support for MUSHRA experiments.
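For readers unfamiliar with Elo, the sketch below shows the standard Elo update for a single comparison between two conditions. How MUSHRA scores are turned into wins and losses is not stated here; the `outcome` helper, which treats the higher-scored condition in a slate as the winner, and the K-factor of 32 are assumptions for illustration only.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    `score_a` is 1.0 if A won, 0.0 if A lost, and 0.5 for a tie.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def outcome(score_a, score_b):
    """Assumed mapping from two MUSHRA scores to an Elo outcome for A."""
    if score_a > score_b:
        return 1.0
    if score_a < score_b:
        return 0.0
    return 0.5


# Hypothetical slate: two conditions rated 80 and 65 by one rater.
new_a, new_b = elo_update(1500.0, 1500.0, outcome(80, 65))
print(new_a, new_b)
```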

February 2026

Feb 27, 2026

Hero media & experiment introductions

0.9.0

Experiments can now include a custom hero image or video via hero_media_url, along with an optional hero_caption. Experiment API responses also now return hero_media_url in all cases, with a null value when not set.
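Because `hero_media_url` is now always present but may be null, client code can read the key directly and check the value instead of testing whether the key exists. The sketch below illustrates this; the payloads are hypothetical and show only the two fields named above, not the full shape of a real API response, and `describe_hero` is an illustrative helper.

```python
import json


def describe_hero(experiment):
    """Summarize the hero media of an experiment payload."""
    url = experiment["hero_media_url"]  # always present, may be None
    if url is None:
        return "no hero media"
    caption = experiment.get("hero_caption")  # optional
    return f"{url} ({caption})" if caption else url


# Hypothetical payloads, not real API responses.
with_hero = json.loads(
    '{"hero_media_url": "https://example.com/intro.mp4", "hero_caption": "Welcome"}'
)
without_hero = json.loads('{"hero_media_url": null}')

print(describe_hero(with_hero))
print(describe_hero(without_hero))
```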

We've also increased the character limit for experiment introduction text.

Feb 22, 2026

Tie option for pairwise video experiments

0.8.9

Pairwise video experiments with a discrete or continuous response type now support a "tie" option.

Feb 22, 2026

Custom metadata, translations & leaderboards

0.8.8

A few more updates in this release:

  • It is now possible to attach custom metadata to slates.
  • We've also added Japanese and Korean translations, and leaderboards are now available for ACR type experiments.
  • Experiments with a single condition also now support a uniform sampling strategy.

Feb 22, 2026

Elo scores, refunds & downloads

0.8.7

This release comes with a variety of improvements.

  • ACR type experiments now support Elo scores.
  • Aborted sessions are now automatically refunded, with no manual intervention needed.
  • The download button now returns JSON consistent with our API, replacing the previous CSV format.

Feb 6, 2026

Plackett-Luce, video & image experiments

0.8.6

We've introduced a method that reinterprets rankings as pairwise comparisons, improving the efficiency of the Plackett-Luce metric.
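One common way to reinterpret a ranking as pairwise comparisons is to emit a (winner, loser) pair for every two items in the ranking, sometimes called full rank breaking. Whether this release uses exactly that scheme is not stated here, so the sketch below is only an illustration of the general idea; `ranking_to_pairs` is a hypothetical helper name.

```python
from itertools import combinations


def ranking_to_pairs(ranking):
    """Break a ranking (best first) into (winner, loser) pairs.

    `combinations` preserves input order, so the first element of each
    pair is always ranked above the second.
    """
    return list(combinations(ranking, 2))


# Hypothetical ranking of three methods, best first.
print(ranking_to_pairs(["method_a", "method_b", "method_c"]))
```

A ranking of n items yields n(n-1)/2 pairs, which can then be fed to any estimator that consumes pairwise outcomes.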

We've also replaced rating buttons with sliders in pairwise video experiments, and ACR image experiments now support continuous scores and sliders too.

We also fixed a hover state issue in bar plots.

Feb 6, 2026

Pairwise video experiments, API & infrastructure

0.8.5

Added new response types and support for multiple dimensions in pairwise video experiments.

It is now possible to create datasets, experiments, and jobs in a single API request. Custom strategies are also now supported.

This release added S3 storage support for customer data alongside a new Organisation model. We also fixed an issue with large file uploads in the browser, which now use multipart uploads.

January 2026

Jan 25, 2026

Scale-to-fit mode

0.8.4

Added a "scale to fit" option for ACR image experiments.

Jan 21, 2026

Self-hosted datasets

0.8.3

It is now possible to use self-hosted datasets by providing a list of URLs, instead of uploading media files directly to us.

We also improved our API: it is now possible to launch jobs via the API, where previously it was only possible to configure drafts.

Jan 13, 2026

Embedded experiments

0.8.1

Our embedded experiments are now widely available. This type of experiment uses JavaScript to include arbitrary content in experiments, and is ideal for running interactive studies.

December 2025

Dec 19, 2025

Survey experiments

0.7.9

We released new types of experiments that allow the configuration of arbitrary surveys below images, audio, or video.

Dec 16, 2025

Improved support for large numbers of conditions

0.7.8

We improved our support for datasets with very large numbers of conditions. This is useful, for example, when you want to collect labels for training and need to label a large number of audio clips, images, or videos that are not AI-generated.

Dec 12, 2025

Strategy filters

0.7.7

Selection strategies have received more configuration options. For example, it is now possible to evaluate only a subset of a dataset. It is also possible to always include one method in pairwise comparisons against other methods.

This release also makes it possible to scale (instead of cropping) images in pairwise image experiments.

November 2025

Nov 29, 2025

Plots on rubrics

0.7.5

We added the ability to add configurable plots to rubrics and leaderboards.

It is now also possible to create draft experiments and jobs via our API.

Nov 21, 2025

Public launch 🚀

0.7.4

Today, we are opening up Mabyduck to everyone.

Nov 14, 2025

Design improvements

0.7.3

We made small tweaks to our design and changes to our backend to prepare for a public launch.

Nov 7, 2025

Improved session browser

0.7.2

This release contained several improvements:

  • An updated session browser, which makes it easier to see demographic and other meta information at a glance.
  • We added the option to filter leaderboards to include or exclude certain conditions.
  • We now automatically fragment mp4 files uploaded to our platform. Our pairwise video experiments previously required you to upload already fragmented videos.

October 2025

Oct 31, 2025

Confidence regions in line graphs 📈

0.7.1

We added optional confidence regions to line graph visualizations of your results.

This release also adds support for references in ACR audio experiments.

Oct 24, 2025

Additional languages 🇵🇱

0.7.0

This release comes with a variety of improvements.

  • We refactored the way our rater pools work, paving the way for us to offer you highly customized rater pools.
  • We updated the UI of the pairwise video experiment to match the recently updated UI of pairwise image experiments.
  • We added support for 4 more languages, namely Spanish, Polish, Chinese, and Vietnamese.

Oct 17, 2025

Improved upload for large datasets

0.6.3

We improved the handling of very large dataset uploads through the browser. If a dataset upload is interrupted for any reason, it is now possible to resume uploads.

This version also adds a new Markdown input field for writing introductions.

Oct 10, 2025

Rater feedback function 💬

0.6.2

We introduced the ability for raters to leave feedback on individual slates and alert us to any potential issues with an experiment.

This version also updated the leaderboards' design.

April 2025

Apr 5, 2025

Internationalization 🇫🇷

0.2.6

We internationalized our experiments. In addition to English, we now support French and German.

Additionally, different experiments can now use different config files. This allows you to upload a single dataset with multiple config files for different experiments.

March 2025

Mar 17, 2025

Added configuration options

0.2.5

Pairwise image experiments now support references. We also introduced new configuration options for the MUSHRA experiment.

Mar 7, 2025

Pairwise image experiments

0.2.2

We introduced a new pairwise image experiment. We also added a way to preview images in datasets.

February 2025

Feb 23, 2025

Pre-screening

0.2.0

We have implemented our own pre-screening protocols. This allows us to provide you with a higher quality of raters whose ability and hardware enable them to detect fine differences between stimuli.

Feb 13, 2025

Crowd-sourced raters

0.1.6

It is now possible to launch experiments to crowd-sourced raters through our platform.

Feb 5, 2025

MUSHRA

0.1.5

We added configuration options to change how waveforms are rendered in MUSHRA experiments. In particular, it is now possible to only render the waveform of the reference so that raters cannot draw conclusions based on the waveform.

January 2025

Jan 29, 2025

Audio experiments and API

0.1.4

Release 0.1.4 is packed with new features:

  • A new API for fetching results programmatically.
  • A new absolute category rating (ACR) experiment for audio stimuli.
  • Added configuration options for pairwise audio experiments, such as the option to vote "Tie" or to check whether audio has been played for a given duration.

Jan 27, 2025

MUSHRA 🪲

0.1.3

We addressed some minor bugs in the MUSHRA experiment.

Jan 16, 2025

Added support for config files

0.1.2

Datasets now support config files. These can be used to change the interface for each slate, for example to display text prompts next to stimuli.

Jan 6, 2025

Private beta 🐣

0.1.0

Today, we are excited to release a private beta version of Mabyduck to our design partners.