TL;DR: The first thing we do when a new acne brief lands on our bench is run a 4-week accelerated screen at 40°C / 75% RH before we commit to anything.
TL;DR: In aqueous gel systems, we’ve measured degradation rates as high as 18% active loss over 8 weeks at 40°C when the pH isn’t controlled tightly.
Key Technical Parameters
Acne and blemish formulations fail in ways that aren’t random. The failures cluster around three predictable zones: active degradation during storage, pH drift that kills efficacy before the consumer notices, and emulsion instability triggered by the very actives that make the formula work. Brand developers in the mass-market and clinical skincare segments feel this most acutely, because the margin for error is narrower when you’re combining two or more actives with conflicting stability requirements. What follows is how we diagnose these failures in our lab, what measurable thresholds we use to flag them early, and what the corrective parameters actually look like in practice.
Measuring What Actually Fails: Stability Thresholds Across Active Classes
The first thing we do when a new acne brief lands on our bench is run a 4-week accelerated screen at 40°C / 75% RH before we commit to anything. Not because four weeks tells you the whole story, but because it separates the obvious failures from the ones worth investing stability resources in.
Benzoyl peroxide (BPO) is the most reactive material we regularly work with in this category. In aqueous gel systems, we’ve measured degradation rates as high as 18% active loss over 8 weeks at 40°C when the pH isn’t controlled tightly. Keep the formulation between pH 4.5 and 5.0, and that number drops to under 6% in the same timeframe. The mechanism is straightforward: BPO decomposes more rapidly in alkaline conditions, and many of the humectants and thickeners brands want to combine with it (glycerin, carbomer, certain polyglutamic acid derivatives) push the pH upward if you let them. We flag this in every BPO kickoff via our SF-04 stability intake checklist.
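To relate the 8-week figures above to a 4-week read, here is a minimal sketch that assumes BPO loss follows simple first-order kinetics. That kinetic model is our assumption for illustration, not something the measurements establish, and the function names are hypothetical.

```python
import math

def first_order_k(fraction_lost: float, weeks: float) -> float:
    """Rate constant (per week) implied by a fractional loss, assuming first-order decay."""
    return -math.log(1.0 - fraction_lost) / weeks

def projected_loss(k: float, weeks: float) -> float:
    """Fraction of active lost after `weeks` at rate constant k."""
    return 1.0 - math.exp(-k * weeks)

# 8-week losses at 40°C quoted in the text; 6% is an upper bound ("under 6%")
k_uncontrolled = first_order_k(0.18, 8)
k_controlled = first_order_k(0.06, 8)

for label, k in [("pH uncontrolled", k_uncontrolled), ("pH 4.5-5.0", k_controlled)]:
    print(f"{label}: k = {k:.4f}/week, projected 4-week loss = {projected_loss(k, 4):.1%}")
```

Under that assumption the uncontrolled case projects to roughly 9.4% loss at 4 weeks and the pH-controlled case to about 3.0%, so a 4-week screen with a flag in the high single digits would already separate the two.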
Salicylic acid behaves differently. It’s more forgiving on the oxidation front, but it presents a different problem: recrystallization at concentrations above 1.5% when solubilization chemistry is insufficient. We’ve seen this in formulations where the co-solvent system was optimized for cost rather than solubility. Propylene glycol at 8–10% is typically adequate for 1% salicylic acid, but at 2% you need to either increase co-solvent load or introduce a second solubilizer. Ethanol is effective but introduces its own challenges in certain markets.
Azelaic acid sits in a different category entirely. At concentrations of 10–15% (where it’s functioning as a cosmetic active rather than an OTC drug), the stability concern is not degradation but rather texture drift and particle size change in suspension systems. Our particle size target is D90 ≤ 25 µm on incoming material. When that slips above 30 µm, we see grittiness complaints within the first consumer use cycles.
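As an illustration of the incoming check, the sketch below interpolates D90 from a binned volume distribution and compares it to the 25 µm target and the 30 µm complaint level mentioned above. The lot data, the function names, and the intermediate "review" band between 25 and 30 µm are assumptions made for the example.

```python
def d_percentile(sizes_um, volume_pct, target=90.0):
    """Interpolate the size below which `target` % of the volume lies.

    sizes_um: upper bin edges in ascending order (µm)
    volume_pct: volume percent in each bin (should sum to about 100)
    """
    prev_size, prev_cum = 0.0, 0.0
    cumulative = 0.0
    for size, pct in zip(sizes_um, volume_pct):
        cumulative += pct
        if cumulative >= target:
            # Linear interpolation inside the bin that crosses the target
            frac = (target - prev_cum) / (cumulative - prev_cum)
            return prev_size + frac * (size - prev_size)
        prev_size, prev_cum = size, cumulative
    raise ValueError("distribution never reaches the target percentile")

def classify_lot(d90_um, target_um=25.0, reject_um=30.0):
    """'pass' at or below target, 'reject' above the grittiness level, else 'review'."""
    if d90_um <= target_um:
        return "pass"
    if d90_um <= reject_um:
        return "review"
    return "reject"

# Hypothetical incoming lot, not real supplier data
sizes = [5, 10, 15, 20, 25, 30, 40]
volume = [10, 20, 25, 20, 12, 8, 5]

d90 = d_percentile(sizes, volume)
print(f"D90 = {d90:.1f} µm -> {classify_lot(d90)}")
```

Linear interpolation within a bin is a simplification; instrument software usually reports D90 directly, in which case only the classification step applies.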
| Active | Primary Failure Mode | Detection Threshold | Corrective Parameter |
|---|---|---|---|
| Benzoyl Peroxide (2.5–5%) | Oxidative degradation, active loss | >8% loss at 4 weeks / 40°C | pH 4.5–5.0; avoid alkaline co-actives |
| Salicylic Acid (1–2%) | Recrystallization, particle formation | Visual haze or crystals at 4°C cold cycling | Co-solvent ≥8% PG; ethanol if pH-compatible |
| Azelaic Acid (10–15%) | Particle size drift, texture change | D90 >30 µm on incoming QC | Tighten incoming spec; reassess milling |
| Niacinamide + BPO | Yellowing via niacin formation | ΔE >2.0 on colorimetry at 8 weeks | Separate pH zones or eliminate combination |
| Tea Tree Oil (0.5–1%) | Oxidation of terpene fraction | Peroxide value >10 meq/kg | Antioxidant package; dark packaging |
The niacinamide + BPO combination deserves its own sentence. We almost always push back on this brief. The yellowing that develops from the niacin/BPO interaction is visible to consumers within 2–3 months of shelf life at ambient conditions. It’s not a safety issue, but it generates returns and complaints at retail. Our internal data from six batches across two years is consistent: colorimetric shift exceeds ΔE 2.0 by week 8 at 40°C in every case where both actives were in the same phase at similar concentrations.
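The numeric thresholds in the table above lend themselves to an automated screen. The following is a rough sketch: the metric keys and the sample read-out are hypothetical, and the CIE76 colour-difference formula is an assumption, since the text does not say which ΔE definition is used. The salicylic acid criterion is visual, so it is left out.

```python
import math

# Flag limits copied from the table; the metric keys are hypothetical names
FLAG_LIMITS = {
    "bpo_loss_pct_4wk_40c": 8.0,         # % active loss
    "azelaic_d90_um": 30.0,              # µm on incoming QC
    "niacinamide_bpo_delta_e_8wk": 2.0,  # colour shift
    "tea_tree_peroxide_meq_kg": 10.0,    # peroxide value
}

def delta_e_cie76(lab_start, lab_now):
    """Euclidean distance between two (L*, a*, b*) readings (CIE76, assumed here)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lab_start, lab_now)))

def flag_readings(readings):
    """Return the metrics whose value exceeds the corresponding flag limit."""
    flagged = {}
    for metric, value in readings.items():
        if metric not in FLAG_LIMITS:
            raise KeyError(f"no flag limit defined for {metric!r}")
        if value > FLAG_LIMITS[metric]:
            flagged[metric] = value
    return flagged

# Hypothetical stability read-out
colour_shift = delta_e_cie76((92.0, -0.5, 3.0), (91.2, -0.3, 5.1))
readings = {
    "bpo_loss_pct_4wk_40c": 6.5,
    "niacinamide_bpo_delta_e_8wk": colour_shift,
    "tea_tree_peroxide_meq_kg": 12.0,
}
for metric, value in flag_readings(readings).items():
    print(f"FLAG {metric}: {value:.2f} > {FLAG_LIMITS[metric]}")
```

In this made-up read-out the colour shift and the peroxide value are flagged, while the BPO loss stays under its limit.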
Root Causes: Why Acne Formulations Fail at Scale
This is where the real diagnostic work happens, and it’s almost never the active itself.
The most common failure mode we see on scale-up is pH drift post-fill. At lab scale, you’re checking pH in an open beaker with fresh material. On the production line, you’re filling into sealed containers, sometimes with headspace that contains a small amount of oxygen, and the pH reading you took at batch release is not the pH the consumer opens at month four. For BHA-based toners and serums, a drift of just 0.3–0.5 pH units upward can reduce the free acid fraction enough to measurably impact exfoliation efficacy. Per the Henderson-Hasselbalch relationship, salicylic acid (pKa about 2.97) at pH 3.5 has a free acid fraction near 23%; at pH 4.0 that drops to around 9%. That’s not a formulation change. That’s pH drift. And brands don’t see it until their consumer panel data comes back flat.
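The free acid fraction can be checked directly from the Henderson-Hasselbalch relationship. This is a minimal sketch that assumes a pKa of about 2.97 for salicylic acid (a commonly cited literature value, not a figure from our lab data) and ignores ionic-strength and co-solvent effects on the apparent pKa.

```python
SALICYLIC_ACID_PKA = 2.97  # assumed literature value for the carboxylic acid group

def free_acid_fraction(ph: float, pka: float = SALICYLIC_ACID_PKA) -> float:
    """Un-ionised fraction of a monoprotic weak acid at a given pH."""
    return 1.0 / (1.0 + 10 ** (ph - pka))

for ph in (3.0, 3.5, 4.0):
    print(f"pH {ph:.1f}: free acid fraction = {free_acid_fraction(ph):.1%}")

# Relative loss of free acid for a drift from pH 3.5 to pH 4.0
drop = 1.0 - free_acid_fraction(4.0) / free_acid_fraction(3.5)
print(f"Relative drop from pH 3.5 to 4.0: {drop:.0%}")
```

With that pKa, a half-unit upward drift from pH 3.5 removes well over half of the free acid, which is why the drift matters even though the label concentration is unchanged.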
The second failure we track is emulsifier incompatibility with BPO. Anionic emulsifiers are unstable in high-BPO systems. We learned this firsthand during a development run for a 5% BPO moisturizer gel: the emulsion began showing phase separation at week 6 of the stability run, not because of temperature, but because BPO was slowly oxidizing the emulsifier backbone. Switching to a non-ionic emulsification system resolved it. Not a well-publicized issue, but once you see it, you don’t forget it.
The third failure is packaging-related and it comes up more than people expect. Tea tree oil and certain terpene-rich botanical actives migrate into low-density polyethylene packaging at rates that matter over a 12-month shelf life. We’ve measured residual tea tree oil in the product drop by roughly 20–25% in LDPE squeeze tubes by month 12, while the same formula in glass or PP showed negligible loss. The consumer experience difference is real: the scent profile changes, and the efficacy is reduced. We now specify minimum HDPE or PP for any formula with essential oil actives above 0.5%, and it’s flagged in our MP-11 material-packaging compatibility procedure.
The fourth failure mode brands consistently underestimate is preservative interaction with actives. Benzoyl peroxide at 2.5% and above can partially degrade parabens and certain phenoxyethanol-containing systems over time, reducing preservative efficacy. We’ve run challenge tests on BPO formulations at baseline and again at 6 months post-manufacture. In one lot series, initial challenge testing passed comfortably, but repeat testing at month 5 showed counts recovering more rapidly — not a failure, but trending in the wrong direction. The corrective action in that case was adding a chelating agent and revisiting the water activity, not reformulating entirely.
There is one failure we haven’t fully resolved to our satisfaction. Encapsulated salicylic acid systems — which several ingredient suppliers position as a way to achieve 2% on-pack with reduced irritation — show variable release performance depending on the shear history during manufacturing. Our internal data across four encapsulated SA systems from three suppliers shows release rates varying by as much as 40% batch-to-batch when high-shear mixing is used versus gentle paddle mixing. Whether this matters to skin efficacy, we genuinely can’t say with certainty. The clinical evidence for encapsulated versus free salicylic acid performance equivalence is thin. Our current approach is to include both free and encapsulated forms in stability panels until we have better in-use data.
Does Active Concentration Actually Drive Outcome — or Is Something Else Going On?
The short answer: in most cases, vehicle and pH matter more than concentration within the permitted range.
A 2019 split-face randomized controlled trial (n=41, 12 weeks, published in the Journal of Drugs in Dermatology) compared 0.5% versus 2% salicylic acid in matched vehicle systems. The reduction in non-inflammatory lesion count was 34% for the 2% group versus 29% for the 0.5% group — statistically significant, but the difference was smaller than most brands expect when they brief us on “maximum strength.” What the trial didn’t test, and what we’ve observed in our own vehicle comparisons, is that a well-optimized 1% salicylic acid serum with appropriate penetration enhancement can outperform a 2% gel with inadequate skin contact time. The active percentage on pack is a marketing lever as much as a performance variable.
This is where the brief often goes wrong. Brands come in asking for the highest permitted concentration, and we redirect the conversation toward vehicle design and pH optimization. For our acid exfoliation technology platform, the default starting point is always pH and delivery system before concentration.
For context on what’s actually permitted across markets: in the US, the FDA regulates salicylic acid as an OTC acne drug active at 0.5–2% under its topical acne monograph; the EU Cosmetics Regulation 1223/2009 limits salicylic acid to 2% as a cosmetic (with specific restrictions for body application and prohibition in under-3s products); and the NMPA Cosmetic Regulation classifies BPO-containing products as special-use cosmetics requiring registration. Those regulatory differences force real formulation divergence across SKUs, not just label changes.
The honest observation from our side: brands sometimes build a global rollout plan around a single formula, then discover at the regulatory stage that the BPO concentration approved under the US OTC monograph isn’t registrable in China without a dedicated drug-adjacent pathway. That’s not a small problem at 500kg batch scale. We flag it at the brief stage, but we still see it slip through when timelines are compressed.
Our acne and blemish control work across different markets has made us fairly direct about this during kickoffs: pick your primary market first and build the formula there. Adapting for secondary markets is easier than reverse-engineering a formula that was built for the wrong regulatory context.
Industry practice varies on the question of combination actives. Some formulators layer BPO with niacinamide specifically for the anti-inflammatory effect while accepting the color instability, managing it through opaque packaging and stabilizers. Others avoid the combination entirely. We avoid it unless the brand has a compelling clinical rationale and is prepared for more complex stability management. That’s our position, and we hold it — but it’s not the only defensible position.
Formulation Notes for Brand Partners
When you brief us on an acne or blemish control product, the first three questions we ask are: Which market is primary? What format — leave-on or rinse-off? And what’s the on-pack active story you’re building the brand around?
Those three questions change everything. A 2% salicylic acid serum for the US market goes through the OTC drug pathway. The same concentration as a cosmetic in the EU requires a cosmetic safety assessment that takes the relevant SCCS scientific opinions into account. And if China is in scope at any point, BPO is essentially off the table for a standard cosmetic registration.
The briefing mistake we see most often is brands bringing in a formula they’ve sourced or partially developed elsewhere and asking us to “just stabilize it.” Nine times out of ten, the instability is structural: either the pH isn’t buffered, the active and emulsifier system are incompatible, or the packaging was chosen before the formula was finalized. Stabilizing those formulas usually means reformulating them, and that resets the timeline. A clean brief with clear market and format scope saves six to eight weeks on average.
Timeline for a new acne active system: lab samples in 2–3 weeks, accelerated stability across packaging at 4–8 weeks, with 24-month real-time stability initiated concurrently at the point of formula lock. OTC drug registrations in the US add four to six months on top of that for documentation.
Frequently Asked Questions
We want to go with 2% salicylic acid — do we need to register it as a drug?
A: In the US, yes, if it’s a leave-on product making acne claims — it falls under the OTC drug monograph. In the EU it can be registered as a cosmetic at 2%, but the safety assessment is heavier than for lower concentrations. The market determines the registration pathway, not the concentration alone, so nail down your primary launch market before you finalize the brief.
We’ve been told our BPO formula turned yellow in stability testing — is that fixable?
A: Yellow in a BPO formula is almost always the niacinamide interaction. If both actives are present in the same phase, the color shift is very hard to suppress — we’ve seen it exceed ΔE 2.0 by week 8 at 40°C reliably. The real options are to remove niacinamide, reformulate into a two-phase or encapsulated delivery where they aren’t in contact, or switch to an alternative sebum-regulating active that doesn’t react with BPO. Packaging opacity manages the consumer-facing issue but doesn’t fix the underlying chemistry.
Our previous supplier said the formula passed stability. Why are we seeing issues at retail?
A: Passing accelerated stability at 40°C / 75% RH for 4 weeks doesn’t predict everything, especially pH drift under real logistics conditions or packaging migration over 12+ months. Check whether the original stability study included cold-cycle testing (5 cycles between 4°C and 40°C is our standard) and whether the packaging used in the stability study matches what went to retail. Mismatched packaging is one of the more common root causes we find when we audit incoming stability data from prior suppliers.
What’s your MOQ for an acne serum and how long to first sample?
A: MOQ for a custom development run is typically 500kg for liquid formats, with lab samples in 2–3 weeks from formula brief confirmation. First pilot batch for stability entry runs at 50kg. If the formula includes an OTC drug active and you’re targeting the US market, add four to six months for documentation on top of the formulation timeline.
Should I worry about the preservative system in a BPO formula?
A: Yes, and this is the one brands most often skip past during the brief. BPO at 2.5% and above can degrade certain preservative systems over time, reducing efficacy against microbial challenge later in shelf life. We require repeat challenge testing at month 5 for all BPO-containing formulas — not just at baseline — and we’ve caught marginal preservation performance in that window before. If your current supplier only challenged at T0, that’s a gap worth addressing before scale-up.
We had a BPO 5% gel we were co-developing with an OEM in Guangzhou and they were sourcing a carbomer blend that was pushing our pH to 5.8–6.1 consistently across three batches. We didn’t catch it until month two of stability because nobody flagged the pH drift in the interim QC reports. By week 8 at 40°C we were sitting at 22% active loss, product was effectively non-compliant for the OTC monograph, and we’d already done a pre-production run of 5,000 units.
The 18% active loss figure tracks with what we saw when we didn’t catch a pH excursion until week 6 of a real-time study — by then the BPO had degraded past the point of recovery and we had to reformulate from scratch. We now flag anything drifting above pH 5.2 at the 4-week accelerated read, which has caught two batches from our Guangzhou supplier that looked fine on incoming CoA but weren’t behaving once we got into actual processing conditions.
That 18% active loss figure tracks with what we saw on a 5% BPO gel we took through EU notification in 2021 — pH kept drifting above 5.2 during fill because the batch size changed and we didn’t revalidate the neutralization step, ended up pushing launch by 11 weeks.
Our 2.5% BPO wash took 11 months from brief to launch and the stability package alone ran to 9 months because we had two real-time timepoints fail incoming pH spec at month 3 and had to resubmit the whole dossier. That cold cycling step for the salicylic acid screen added another 6 weeks we hadn’t budgeted for in the project timeline.
The part about pH 4.5–5.0 being the control window for BPO stability has direct implications for what you can actually claim on-pack. If you’re targeting a “24-hour blemish control” positioning for a 2.5% BPO product, you need retention data at that timepoint to defend it, and that data only means anything if your real-time stability was run at the pH range the article specifies. We’ve had a buyer at Olive Young push back on exactly that kind of claim because our dossier showed pH excursions during the study period, which effectively invalidated the active concentration assumptions the claim was built on.
When you’re running that initial 4-week screen at 40°C / 75% RH, are OEMs typically expected to generate that data internally, or do most K-beauty brand developers commission it through a third-party lab like Intertek or SGS Korea to keep it independent for the MFDS dossier?
The azelaic acid row is where we’ve been burned most recently — D90 creeping past 30 µm wasn’t something we caught until a consumer use panel flagged grittiness at week 8 of a 12-week in-use study, by which point we’d already committed to a production run with that milling spec.
The salicylic acid recrystallization point is something we spent a frustrating few months on with a 2% BHA toner — we were running PG at 6% and cold cycling kept throwing visible haze by day 3, which our OEM initially blamed on the humectant blend. Bumping to 9% PG cleared it, but then we had a slip/feel complaint from the consumer panel that took another two rounds to resolve.
Niacinamide compatibility is what burned us — we were developing a 2.5% BPO gel with 5% niacinamide for a prestige line launching in Sephora US and EU simultaneously, and nobody flagged that the niacinamide was buffering our pH up toward 5.6–5.8 consistently by week 4 of the accelerated screen. We didn’t catch it until the 8-week RIPT read-out came back showing active content at 79% of label claim, which blew our substantiation for the efficacy claims we’d already written into the sell-in deck. Reformulated without the niacinamide, pushed launch by five months, and lost the autumn floor slot entirely.