HAProxy Technologies

Load Balancing RADIUS with HAProxy Enterprise UDP Module

Daniel Skrba and Dragan Dosen — Tue, 06 Aug 2024 12:00:00 +0000

HAProxy Enterprise now supports RADIUS load balancing with its new HAProxy Enterprise UDP Module. This blog post addresses some of the challenges with implementing RADIUS load balancing and shows how you can get started with the new module.

What is RADIUS load balancing?

RADIUS (Remote Authentication Dial-in User Service) is a protocol that communicates over UDP to manage authentication requests for users accessing a network. RADIUS load balancing distributes these authentication requests across backend servers, ensuring the high availability of your RADIUS services.

By preventing servers from becoming overwhelmed, RADIUS load balancing reduces the chance of blocked user authentication, which can lead to significant operational disruption. This makes RADIUS load balancing ideal for organizations that require robust access control mechanisms.

What are the challenges in implementing RADIUS load balancing?

RADIUS load balancing presents challenges due to its reliance on the UDP protocol, which offers no error correction or retransmission of lost packets for authentication requests. Therefore, it’s important that your load balancer be capable of maintaining a reliable delivery rate and high throughput while managing large volumes of RADIUS traffic.

Furthermore, integrating RADIUS load balancing within a network of other load balancers adds another layer of complexity to your infrastructure. A flexible load balancer capable of unifying UDP, TCP, and HTTP under a single solution can address this added layer of complexity.

As with other protocols, RADIUS load balancing requires health checks and failover mechanisms to ensure high availability, conduct maintenance, and redistribute traffic should a server fail. It’s imperative that your load balancer be capable of monitoring the health of backend servers to maintain authentication availability.

Lastly, as the number of users and devices in a network increases, your load balancer must be capable of seamlessly distributing traffic to new backend servers as you add them to support increasing traffic loads.

HAProxy Enterprise UDP Module

The HAProxy Enterprise UDP Module addresses the challenges that come with RADIUS load balancing (and more!). The module empowers users by exposing UDP services through HAProxy Enterprise, making it a suitable load balancer for time-sensitive applications such as RADIUS authentication traffic.

With the addition of the UDP module, HAProxy Enterprise unifies TCP, HTTP, and UDP load balancing. This single-solution approach to Layer 4 and Layer 7 load balancing reduces infrastructure complexity.

HAProxy Enterprise UDP Module offers faster and more reliable load balancing than other software load balancers. It also delivers dynamic traffic routing as you scale out your backend servers to handle increasing authentication requests. Health checks to monitor server health and the capability to conduct server maintenance without service interruption ensure the constant availability of your RADIUS services.

Getting started: RADIUS load balancing with HAProxy Enterprise

To get started with RADIUS load balancing, we’ll first have to install the UDP module. Our UDP module documentation covers everything you need to know about installing the UDP module. Once the UDP module is installed, you can configure RADIUS load balancing.

Configuring RADIUS load balancing

Now, configure the UDP module to load balance RADIUS authentication traffic. The UDP module listens on configured ports, balances requests to RADIUS servers, and returns responses to clients.

Use the example configuration below to start load balancing RADIUS traffic. In the example, the load balancer routes traffic to both RADIUS authentication (1812) and accounting (1813) ports:

[Configuration example: blog20240806-01.cfg]

Understanding the configuration

The configuration is broken up into two sections using the keyword udp-lb and named radius-auth and radius-accounting.

We’ve specified a dgram-bind on all interfaces on port 1812 for radius-auth and port 1813 for radius-accounting. Make sure the ports you specify match the ports defined in your RADIUS configuration (1812 and 1813 are the RADIUS defaults).

The load balancing algorithm is set to source so that requests from the same client are routed to the same server (if the target server is healthy, and the number of running servers is not changed).
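To make the idea behind source-based balancing concrete, here is a minimal Python sketch. It is not HAProxy's implementation: the CRC32 hash, the modulo mapping, and the server names are assumptions chosen for illustration only.

```python
import zlib


def pick_server(client_ip: str, servers: list[str]) -> str:
    """Map a client IP to one server by hashing the address."""
    # CRC32 is a stand-in hash for this sketch, not the hash HAProxy uses.
    index = zlib.crc32(client_ip.encode("ascii")) % len(servers)
    return servers[index]


servers = ["radius1", "radius2", "radius3"]  # hypothetical server names
first = pick_server("192.0.2.10", servers)
second = pick_server("192.0.2.10", servers)
print(first == second)  # prints True: same client, same server
```

Because the index depends on the length of the server list, adding or removing a server can move a client to a different server. That is why the mapping only holds while the number of running servers stays the same.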

The option udp-check directive enables UDP-based health checks. The check keyword on each server line activates these health checks for individual servers, verifying their availability and health.

We did not set proxy-requests so that all requests from the same client are routed to the same server (unless the client session has expired or the server is no longer available). Similarly, we did not set proxy-responses as the RADIUS server will provide multiple responses.

Finally, we listed three servers that will receive the load balanced RADIUS traffic. These servers have been configured as RADIUS servers and will respond to requests on the default RADIUS ports 1812 and 1813.

Now you’re ready to load balance RADIUS traffic!

RADIUS accounting

Your RADIUS servers should be aware of the load balancing setup, especially for accounting purposes. RADIUS accounting involves two types of messages: session start and session end. The RADIUS servers should be able to log this information in a shared storage (usually a database) to maintain consistency and accuracy in accounting records. Without this shared storage, you might face discrepancies in accounting data due to the distributed nature of the load balancing.

Conclusion

With HAProxy Enterprise and the UDP Module, it’s easy to set up load balancing for your RADIUS servers and benefit from high-performance load balancing with reliable UDP packet delivery in a unified solution. Health checks and failover enable high availability, while dynamic backends and traffic routing enable seamlessly scalable infrastructure.

For more information on RADIUS load balancing and the directives used for configuration, visit our UDP module documentation page on load balancing UDP with HAProxy Enterprise.

We’re working on ways to make configuring RADIUS load balancing even easier. Subscribe to our newsletter below to stay tuned for the latest updates.


How to Reliably Block AI Crawlers Using HAProxy Enterprise

Jakub Suchy — Thu, 18 Jul 2024 08:13:00 +0000

The robots.txt file is a time-honored point of control for website publishers to assert whether or not their websites should be crawled by bots of various kinds. However, it turns out that AI crawlers from large language model (LLM) companies often ignore the contents of robots.txt and crawl your site regardless.

Now you may indeed want your site crawled by some or all AI crawlers, for reasons that may include:

Wanting LLMs to share accurate information provided on your site
Wanting to support AI and LLMs in general

However, you may also want to block some or all AI crawlers, for reasons that may include:

The desire to have LLMs pay you for access to the content on your site, as some news sites have recently arranged with specific LLM providers
Overall concern about bots hurting your site’s performance
Not wanting to accommodate AI and LLMs in general

Whatever your reasoning, you may want to block these crawlers—but to do so, you'll need to take additional steps. HAProxy Enterprise enables you to do this, as we describe in this article.

HAProxy Enterprise provides you with several advantages when you block crawlers from any source, including LLMs. HAProxy Enterprise sends zero traffic to third parties for classification of bots. All the work happens within your own systems, so you can block the crawlers you want to block without incurring extra latency, all while avoiding unnecessary compliance and security concerns.

Understanding bot management within HAProxy Enterprise

HAProxy Enterprise includes the powerful HAProxy Enterprise Bot Management Module, which provides fast, reliable, and flexible identification and categorization of bots attempting to access websites or applications. It also helps make routing decisions for bot traffic and various crawlers—whether they announce themselves or not. Let’s explore how our bot management features can block AI bots from accessing your site.

You might be wondering why we wouldn't just detect User-Agent strings. User Agents are some of the easiest things to fake in requests. Meanwhile, our Bot Management Module uses multiple techniques to verify that the specific User-Agent string in each request is authentic. That’s why we'll refer to “verified” categories of bots below.
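To see why a User-Agent string alone proves nothing, consider this small Python sketch. It only builds a request object and sends nothing over the network; the crawler name and URL are made up for the example.

```python
from urllib.request import Request

# Any client can claim to be any crawler: the header is just text it supplies.
spoofed = Request(
    "https://example.com/",
    headers={"User-Agent": "ExampleCrawler/1.0"},  # hypothetical crawler name
)

# urllib stores header names capitalized as "User-agent".
claimed_agent = spoofed.get_header("User-agent")
print(claimed_agent)  # prints ExampleCrawler/1.0
```

A rule that matches only on this string trusts whatever the client chooses to send, which is why verification matters.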

Installing the HAProxy Enterprise Bot Management Module

Our bot management documentation (login required) offers step-by-step instructions for basic module installation. Note that you'll need either an HAProxy Enterprise trial or an active subscription to access these resources. However, we'll share some important details here to help demonstrate how the HAProxy Enterprise Bot Management Module works.

Generally, we recommend first installing the latest version of the HAProxy Enterprise Bot Management Module and downloading the latest version of the corresponding data file containing the magic of bot management detection. Next, you'll add a basic scoring configuration to a single frontend or all frontends, shown below:

[Configuration example: blog20240718-01.cfg]

From now on, your requests will contain data telling you how your traffic was scored. Let’s look at blocking AI bots next.

Blocking AI bots

The HAProxy Enterprise Bot Management Module gives you a lot of information, including:

Bot management scores
Detection information for verified crawlers such as Googlebot
Detection information for verified bots such as an AI crawler

We can simply use a verified bot category to detect when an AI crawler accessed your site. Add the following line of configuration to your frontend section, immediately after the filter botmgmt line:

[Configuration example: blog20240718-02.cfg]

The acl is_ai_bot will be true if we've detected an AI crawler. To finish up, you can now add an extra line to deny this traffic:

[Configuration example: blog20240718-03.cfg]

Denying individual AI crawlers

A universal blocking strategy for all crawlers isn't always necessary. For example, maybe you want to generally allow AI crawlers but specifically deny traffic from ClaudeBot. You can do this by adding two lines to your configuration:

[Configuration example: blog20240718-04.cfg]

In the same way, you can use an ACL to detect and block ChatGPT-User and other similar bots.

Reviewing our complete configuration

We've identified the various configuration components behind bot management in HAProxy Enterprise, and how these form a simple-yet-capable blocking strategy for AI crawlers. Here's the full configuration example as it appears within your file:

[Configuration example: blog20240718-05.cfg]

Tackle tomorrow's bots and crawlers today

Using HAProxy Enterprise Bot Management Module, you can easily block traffic from bots, verified crawlers, and/or AI crawlers. We've outlined how just a few lines of code with HAProxy Enterprise can noticeably improve your overall bot management strategy, while safeguarding your content and application infrastructure against modern threats, which might include AI crawlers. This specialized blocking will become even more critical as LLMs grow more popular and plentiful.

To learn more about HAProxy Enterprise Bot Management Module and HAProxy Enterprise's built-in security features, check out our security solution and the HAProxy Enterprise datasheet.


The Customer Feedback Loop That Makes HAProxy G2’s Satisfaction Score Leader

Floyd Smith — Wed, 17 Jul 2024 01:46:00 +0000

"Summertime, and the livin’ is easy," say the lyrics to a much-loved song. And HAProxy’s user ratings on the quarterly G2 Grid® Reports for Summer 2024 are hot indeed with a Satisfaction Score of 98!

G2 Grid® Reports are quarterly reports from G2.com, the world’s leading platform for real user reviews of business software. HAProxy’s press release describes the results in detail. For this summer, HAProxy achieved a record 18 Leader positions.

This includes long-time leadership in Load Balancing and Container Networking, plus new Leader positions in major areas such as API Management, DDoS Protection, DevOps, Web Application Firewall (WAF) and Web Security. G2 has also awarded HAProxy 47 badges, recognizing achievements in key areas such as Best Results, Best Relationship, and Best Usability.

High satisfaction helps HAProxy lead the Leader quadrant

A G2 Grid® is an interactive tool that can tell you a great deal about a product. For instance, the default G2 Grid® for Load Balancing shows HAProxy in the Leader quadrant, with an amazingly high Satisfaction score (the left-to-right axis) of 98 and a Market Presence score well above average. To learn more about these scores, check out G2’s Research Scoring and Methodologies guide.

When using the Live score, you can also select for the market size of responding companies - Small Business, Mid Market, or Enterprise. Use this selection option to find the grid result and reviews appropriate to your company size.

Then, try the Trending option from G2. The picture shifts to show who’s hot, as you see in Figure 1. The vertical axis is replaced by a Momentum score. HAProxy is now #1 on each of the axes, as well as overall.

G2 results show HAProxy as a leader in load balancing, container networking, API management, DDoS protection, and web application firewall - and more. Check out the G2 reviews for HAProxy yourself and see what you think.

Figure 1: Momentum Grid® Report for Load Balancing

How HAProxy listens to user feedback to improve customer satisfaction

How did HAProxy reach a customer satisfaction of 98 for load balancing in the G2 Grid® Reports? Sure, good products have happy customers. But how can customers be that happy?

Part of the answer is simple. HAProxy has achieved high customer satisfaction ratings on G2 by responding to the feedback that customers leave in G2 Reviews.

One of the most challenging aspects of these reviews, for vendors, is that they always include the dreaded question: “What do you dislike about [product]?” As a vendor, you see your product’s weak spots highlighted in customer reviews, for all to see. 😬

But, here at HAProxy, we listen to all the feedback. Then we work to “turn that frown upside down” by addressing customer concerns.

Here are a few comments from the “dislike” question for HAProxy load balancer reviews in 2020:

“No UDP support.”
“A GUI for configuration would be a plus.”
“There are no historical stats kept by the software.”
“I've never had HAProxy fail in the 10+ years I've been using it.”

OK, that last one was pretty nice – but what about the others?

Here’s what has happened since those reviews were posted:

In November 2022, we announced HAProxy Fusion Control Plane, which includes both historical stats and a GUI.
And just recently, we released HAProxy 2.9, which includes – wait for it – UDP support!

Recent G2 reviews from the Summer 2024 period show that these products are received positively:

“Fusion is amazing - alone worth the license cost”
“Everything like, no cons detected, please use it”

So that’s part of how HAProxy achieves such solid results: by listening to customers; feeding the input into the product roadmap; and launching new products and improving existing ones to give customers more of what they need.

What’s a user to do with G2?

What are you to make of these outstanding results? For one thing, HAProxy cares about what users think and say. We are constantly seeking customer feedback, input, and ideas.

For another, HAProxy has been, is, and always will be obsessively customer-centric. Not only do we put in the work up front; we follow up, with the G2 Grid® Reports as a critical source of vital customer feedback.

And HAProxy is the undisputed leader in multiple categories. For organizations needing load balancing, API gateway, WAF, or DDoS protection, HAProxy Technologies should be on any team’s shortlist.


Create an HAProxy AI Gateway to Control LLM Costs, Security, and Privacy

Jakub Suchy — Thu, 11 Jul 2024 08:14:00 +0000

The introduction of ChatGPT two years ago caused sharply increased interest in (and use of) large language models (LLMs), followed by a crush of commercial and open source competition. Now, companies are rushing to develop and deliver applications that use LLM APIs to provide AI functionality.

Companies are finding that AI-based applications, just like conventional applications, have deliverability concerns. For most conventional applications, an API gateway is a vital tool for deliverability. Something very similar is needed for AI applications, but some of the specifics are different. So, a new form of API gateway, called an AI gateway, is coming to the fore. HAProxy is one of the companies pioneering the development and delivery of this new type of gateway.

What’s new with AI gateways?

One key difference between a conventional API gateway and an AI gateway is in the area of rate limiting. API gateways implement rate limiting for requests based on the number of requests per IP address, which contributes to balanced service delivery and is an initial step in limiting the impact of malfunctioning or malicious software.

While AI gateways also require rate limiting, the limits imposed should be based on the API key and the number of tokens used rather than on the requests per IP address. This actually provides a higher level of control than is possible with conventional rate limiting, since the API key and token counts are specific to a given machine, whereas IP addresses don’t always represent a single machine.

Other needs such as data loss prevention, API key management, retry support, and caching are more or less the same. However, the way these similar requirements are implemented introduces some differences, which we'll discuss later.

Implementing an AI gateway in HAProxy Enterprise

In this post, we'll build an AI gateway using HAProxy Enterprise. We'll showcase the steps using OpenAI APIs.

In part 1 of this guide, we'll implement the basics:

Creating a gateway in front of OpenAI and providing rate limiting per API key

In part 2 (coming later), we'll tackle more advanced steps:

Enhancing the configuration for a gateway using multiple API keys, in front of vLLM
Adding API key encryption
Implementing security for personally identifiable information (PII) protection and data quality at the gateway level

To make all this happen, we'll use HAProxy Enterprise with HAProxy Fusion Control Plane (included with HAProxy Enterprise). Together, these products make it easy to implement an AI gateway that's performant and scalable. Let's first review a few details to show you how everything will fit together.

Challenges facing AI applications

We won’t have a full picture of the unique challenges AI applications face until the industry has more experience creating and delivering them at scale. However, at this early stage, some important concerns have already arisen:

Cost control is paramount – Developers usually aren’t (and maybe shouldn’t be) aware of the full expense and large per-token costs of running an LLM.
API keys get compromised – LLM platforms such as OpenAI let you create different API keys per developer—a basic security measure. These keys, like any other API keys, can still be compromised or stolen. However, the urgency to enforce sophisticated key management and protection isn't quite keeping pace with today's usage trends. HAProxy can help you to bridge the gap.
API key quotas get used up – Some LLM platforms let you set rate limits for tokens. However, these limits are set globally and not per API key. Wherever developer-specific APIs are available, a single developer can use up most or all of your daily token quota.
Security and PII concerns must be addressed – Users' prompts must not include PII, such as social security numbers or credit card information.

You can begin to address these concerns effectively using the AI gateway that we'll show you how to create here, improving your delivery of LLM-powered AI applications.

New to HAProxy Enterprise?

HAProxy is the world’s fastest and most widely used software load balancer and the G2 category leader in API management, container networking, DDoS protection, web application firewall (WAF), and load balancing. HAProxy Enterprise elevates the experience with authoritative support, robust multi-layered security, and centralized management, monitoring, and automation with HAProxy Fusion. HAProxy Enterprise and HAProxy Fusion provide a secure application delivery platform for modern enterprises and applications.

To learn more, contact our sales team for a demonstration or request a free trial.

Key concepts before getting started

The following concepts are necessary to fully understand this guide. You may need to adapt these concepts and the capabilities of HAProxy Enterprise and HAProxy Fusion to your specific use case.

Storing and encrypting API keys

Storing unencrypted API keys on a load balancer is never a good idea, even though you can use them as keys for rate limits and more.

Since we're just implementing the basics needed for an AI gateway in this guide, we'll simply accept a key for OpenAI, hash it, and use the result as a key for rate limiting.

In part 2, we'll demonstrate how to encrypt your keys and create an intermediate key, so even your application developers can't access real API key values—a production-ready approach.

Quotas and global rate limiting

HAProxy includes stick tables that can be used as counters for rate limiting. Our use case requires a scalable active/active AI gateway, so we need all HAProxy instances to aggregate—and be aware of—each other's rates.

Our Global Profiling Engine (GPE), which comes pre-installed and integrated with HAProxy Fusion, provides exactly that. This feature automatically aggregates all token rates across every HAProxy Enterprise instance. GPE will later include ranking capabilities out of the box, enabling it to determine the most-used keys within your organization, as well as a convenient web endpoint for integration with any number of other systems. GPE is also available as a standalone module. Future HAProxy Fusion Control Plane releases will also provide similar metrics within customizable dashboards, enhancing observability for your AI-powered applications.

Uniquely, GPE can be configured to aggregate historical rates. Static token limits are really hard to define correctly in unpredictable environments, so we'll dynamically rate limit users based on their usage. In our example, we will impose rate limits when current usage exceeds twice the 90th percentile of the general usage during the same time period on the previous day. You can implement different and more sophisticated controls in your own AI gateway implementation.
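The "twice the 90th percentile" rule can be worked through with a short Python sketch. This is not how GPE computes its aggregates: the nearest-rank percentile method and the sample numbers are assumptions made for the example.

```python
import math


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def is_over_dynamic_limit(current_rate: float,
                          yesterday_rates: list[float],
                          multiplier: float = 2.0) -> bool:
    """True when current usage exceeds multiplier x yesterday's 90th percentile."""
    return current_rate > multiplier * percentile(yesterday_rates, 90)


# Hypothetical token rates seen across keys in the same period yesterday.
yesterday = [100, 120, 90, 300, 110, 95, 105, 130, 115, 125]

print(percentile(yesterday, 90))             # prints 130
print(is_over_dynamic_limit(260, yesterday)) # prints False (260 is not above 2 x 130)
print(is_over_dynamic_limit(261, yesterday)) # prints True
```

Using a percentile rather than the maximum keeps a single outlier (the 300 in this sample) from inflating everyone's limit.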

Metrics and statistics

We also want to use HAProxy's extensive logging capabilities to collect API usage metrics. Specifically, we'll log the total amount of prompt tokens and completion tokens consumed by all users.

We have two flexible collection options, which can be used separately or jointly:

Logging the statistics into the standard HAProxy log, then parsing the logs
Asynchronous, real-time funneling of token metrics into an external endpoint such as Grafana, an HTTP endpoint for TimescaleDB, or others (not covered in part 1)

PII protection

We want to detect social security numbers, credit card numbers, and other types of potentially sensitive data.

How to implement HAProxy Enterprise as an AI gateway

This guide describes how to use HAProxy Enterprise, which has built-in API gateway functionality, as an AI gateway instead. The concept is the same, and we'll explore tokens, API key-based limiting, and other differences as they arise.

Step 1: API key implementation

Let’s look at implementing the API key authorization. In this first version of our AI gateway, your users will continue using standard OpenAI keys, but we'll implement additional controls on top of those keys. These include the following:

Denylists to outright block any compromised keys
A quota or a rate limit per key

We never recommend storing unencrypted OpenAI keys (or keys of any kind) on your load balancer, so we'll use hashing and store only the hashes. On each request, HAProxy will receive the OpenAI key, hash it using SHA-256, and compare the hash to the stored data.

Hashing API keys

Let’s say your requests contain an HTTP Authorization header with the OpenAI key. We can get the value of the key, apply a SHA-2 256 digest, then return the value to a variable inside HAProxy:

[Configuration example: blog20240711-01.cfg]

Denying compromised API keys

Let’s say you have a client with an OpenAI key (hashed as 5fd924625a10e0baacdb8) that’s been compromised and must be blocked at the load balancer layer. We're going to create a file called denylist.acl with a single compromised API key hash per line:

[Example file: blog20240711-02.txt]

You can generate the hashes for your file with a variety of tools, such as this hashing calculator on GitHub.
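If you'd rather script it, a few lines of Python can produce a SHA-256 hash for a key. This is a sketch under two assumptions: the header has the form "Bearer <key>", and only the key part is hashed. The key shown is a placeholder, not a real one.

```python
import hashlib


def hash_api_key(authorization_header: str) -> str:
    """Return the SHA-256 hex digest of the key in an Authorization header."""
    # Assumption: the header looks like "Bearer <key>" and only <key> is hashed.
    scheme, _, key = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not key.strip():
        raise ValueError("expected 'Bearer <key>'")
    return hashlib.sha256(key.strip().encode("utf-8")).hexdigest()


digest = hash_api_key("Bearer sk-example-not-a-real-key")
print(len(digest))  # prints 64: a SHA-256 digest is 64 hex characters
```

Whatever tool you use, the hash in your file must match what the gateway computes on each request: same input (with or without the "Bearer" prefix) and same letter case in the hex output.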

For security reasons, HAProxy Enterprise is designed not to use any I/O once loaded. HAProxy Fusion also ensures that HAProxy Enterprise always has an up to date denylist.acl file in memory. When needed, it’s easy to write a script to make changes to this file using HAProxy Fusion’s native API.

You can now block any key on denylist.acl as follows in your HAProxy configuration:

[Configuration example: blog20240711-03.txt]

And that’s it! Every time you add a hashed key to the denylist, HAProxy will block it from using your service. You can effectively block OpenAI API keys without actually storing unencrypted copies on your instance.

Step 2: Quotas and rate limits

Next, we want to implement quotas or rate limits per key. Enterprise-grade quotas require the following properties:

Quotas or rate limits should be persisted across all load balancer instances, whether they are configured as Active/Active or Active/Passive.
Quotas or rate limits should be flexible and configurable to any specification. In this example, we'll use per-minute limits and daily limits for both prompt tokens and completion tokens. This matches OpenAI's implementation while also letting you control usage per individual API key, instead of only at the account level.

We'll start with a .map file to define our limits. I’ve chosen the following format:

[Example format: blog20240711-04.txt]

Next, let’s look at an example .map file named rate-limits.map:

[Example file: blog20240711-05.txt]

For illustration, this file really represents a table of limits that looks like this:

Key hash	Prompt tokens per minute	Prompt tokens per day	Completion tokens per minute	Completion tokens per day
5fd924625a10e0baacdb8	100	200	1000	50000
813490e4ba67813490e4	300	600	2000	30000

We absolutely want to support quotas or rate limits across all load balancers in an Active/Active configuration. By default, HAProxy Enterprise's stick tables are used for rate limiting per instance—each instance of HAProxy can configure a local stick table. To comply with the above limits (for example 100 prompt tokens per minute), you’d need to set the actual limit to be 100 divided by the number of HAProxy instances in your cluster. However, you can't easily autoscale, and you would need to constantly update calculations. Mathematical errors and other issues would be hard to detect.
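A quick bit of arithmetic in Python shows why dividing a limit by the instance count is fragile. The function name is made up for this illustration.

```python
def per_instance_limit(global_limit: int, instances: int) -> int:
    """Split a cluster-wide limit evenly across instances (integer division)."""
    if instances < 1:
        raise ValueError("need at least one instance")
    return global_limit // instances


print(per_instance_limit(100, 2))  # prints 50
print(per_instance_limit(100, 3))  # prints 33: the cluster now enforces 99, not 100
```

The rounding loss is the small problem. The bigger one is skew: if most of a key's traffic happens to land on one of three instances, that instance starts denying at 33 tokens per minute even though the key is far below its cluster-wide limit of 100. The split also has to be recalculated every time the cluster scales.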

It's Global Profiling Engine (GPE) to the rescue! In an Active/Active load balancer configuration, where traffic is spread across two or more load balancers in a round-robin rotation, GPE ensures that each load balancer receives a client's aggregated requests. This is true even when those requests were routed to another load balancer in the cluster.

I'm going to perform a director's cut here: if you're using HAProxy Fusion, you're already using GPE by default. It's automatically configured for each cluster of load balancers you create. If you aren't using HAProxy Fusion, you can easily install the GPE module by following the instructions in our documentation.

Defining stick tables

Let’s focus solely on the configuration and define HAProxy stick tables to hold our prompt rates. We'll need four tables with two variants—or eight total tables. The variants will be local and aggregate. It's important that on the HAProxy instances we'll write (or track) our data into the local tables, yet read from the aggregate tables. The Global Profiling Engine supplies these aggregate tables, which contain all aggregated rates across all instances.
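The write-local, read-aggregate pattern can be modeled in a few lines of Python. This is only a conceptual sketch: the class, the function, and the instance names are hypothetical, and the aggregation step stands in for what the Global Profiling Engine does, without reflecting how it is actually implemented.

```python
from collections import Counter


class Instance:
    """One load balancer with a local table and a read-only aggregate table."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local = Counter()      # written by this instance only
        self.aggregate = Counter()  # cluster-wide totals pushed back to it

    def track(self, key_hash: str, tokens: int) -> None:
        self.local[key_hash] += tokens


def aggregate(instances: list[Instance]) -> None:
    """Sum every local table and hand each instance the cluster-wide totals."""
    total = Counter()
    for instance in instances:
        total.update(instance.local)
    for instance in instances:
        instance.aggregate = Counter(total)


lb1, lb2 = Instance("lb1"), Instance("lb2")
lb1.track("5fd924625a10e0baacdb8", 60)
lb2.track("5fd924625a10e0baacdb8", 70)
aggregate([lb1, lb2])

print(lb1.local["5fd924625a10e0baacdb8"])      # prints 60
print(lb1.aggregate["5fd924625a10e0baacdb8"])  # prints 130
```

Each instance sees only 60 or 70 tokens locally, but a limit check against the aggregate table sees the full 130.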

To get started, we'll need two of each of the following:

Per-minute prompt tokens tables
Per-day prompt tokens tables
Per-minute completion tokens tables
Per-day completion tokens tables

Here's a sample definition of all of them:

[Configuration example: blog20240711-06.cfg]

This set of tables looks complex but is actually simple. We're intentionally using more tables to support all notable limits, since OpenAI itself supports both per-minute and per-day limits. Prompt and completion tokens are handled separately.

Two differences exist between our per-minute and per-day tables:

The per-minute tables store per-minute rates and expire every 60 seconds.
The per-day tables store daily rates and expire every 24 hours.

Fetching rate limits from the map file

Next, we'll fetch the rate limits for each OpenAI key by looking up each hash in the .map file, and requesting the fields that we can use to set the maxrate variables.

The field(1,:) converter returns the first number from the .map file entry for the key, with fields delimited by a colon:

[Configuration example: blog20240711-07.cfg]

If the key isn't in the .map file, the default result will be zero. That means no requests will be allowed.
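The lookup logic can be sketched in Python. The line layout here is an assumption: a key hash, whitespace, then four colon-delimited limits in the same order as the table of limits shown earlier. The helper names are made up.

```python
def load_limits(map_text: str) -> dict[str, list[int]]:
    """Parse lines of the assumed form '<key-hash> <n>:<n>:<n>:<n>'."""
    limits = {}
    for line in map_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key_hash, value = line.split(None, 1)
        limits[key_hash] = [int(part) for part in value.split(":")]
    return limits


def limit_for(limits: dict[str, list[int]], key_hash: str, field: int) -> int:
    """Return the 1-based field for a key, or 0 when the key is unknown."""
    values = limits.get(key_hash)
    if values is None or not 1 <= field <= len(values):
        return 0
    return values[field - 1]


MAP_TEXT = """\
5fd924625a10e0baacdb8 100:200:1000:50000
813490e4ba67813490e4 300:600:2000:30000
"""

limits = load_limits(MAP_TEXT)
print(limit_for(limits, "5fd924625a10e0baacdb8", 1))  # prints 100
print(limit_for(limits, "813490e4ba67813490e4", 4))   # prints 30000
print(limit_for(limits, "unknown-hash", 1))           # prints 0
```

Defaulting to zero means an unlisted key fails closed: it gets no quota rather than unlimited quota.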

Tracking current requests

Once we've fetched our rate limits from the .map file, we must carefully track the current requests using a sticky counter:

[Configuration example: blog20240711-08.cfg]

Getting current request rates

Next, request the current rates from the associated aggregated tables:

[Configuration example: blog20240711-09.cfg]

The second number in sc_gpc_rate(0,X) refers to the corresponding track-scX statement from the previous sticky counter. For example, sc_gpc_rate(0,1) and track-sc1 are coupled.

Calculating if over the limit

Finally, let's determine if the current request exceeds the rate limit we've set. We're effectively making this comparison in pseudo-code:

[Pseudo-code example: blog20240711-10.txt]

We'll use this calculation for two purposes:

To initially deny the request if the customer exceeds their limit
To add the amount of tokens returned into the stick table after getting a response

Here's how we'll deny requests if they exceed the token limit, which subsequently triggers the 429 Too Many Requests error code:

[Configuration example: blog20240711-11.cfg]

Crucially, we don’t actually know yet if the current request will exceed the limit. For performance reasons, we're relying on the OpenAI response itself to tell us how many prompt and completion tokens are consumed. This means these limits are actually eventually consistent.

You could run a tokenizer on each request to improve consistency, but doing so would be slow. Eventually consistent Active/Active rate limits work best.
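Here is a small Python model of that check-first, count-later flow. It is a sketch, not the gateway configuration: the class is hypothetical, window expiry is left out, and the comparison uses ">=" as an assumption so that a limit of zero denies everything.

```python
class KeyUsage:
    """Token usage for one API key within one rate-limit window."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def admit(self) -> int:
        """Decide before forwarding: 429 if the key is already at its limit."""
        # Assumption: '>=' so that a limit of 0 (unknown key) denies all requests.
        return 429 if self.used >= self.limit else 200

    def record(self, tokens: int) -> None:
        """Add the token count reported in the response."""
        self.used += tokens


usage = KeyUsage(limit=100)
statuses = []
for tokens_in_response in (60, 60, 10):
    status = usage.admit()
    statuses.append(status)
    if status == 200:
        usage.record(tokens_in_response)

print(statuses)    # prints [200, 200, 429]
print(usage.used)  # prints 120
```

The second request is admitted at 60 tokens used and pushes the total to 120, so the key overshoots its limit of 100 before the third request is denied. That overshoot is the price of not tokenizing every request up front.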

Getting token usage from JSON responses

We're finally nearing the finish line! The OpenAI HTTP response will contain information about per-request token consumption for prompt and completion. We can use HAProxy’s JSON parser to get the details:

[Listing: blog20240711-12.cfg]

There are some potential limitations to be mindful of. The above code inspects only as much of the response as fits in the allocated buffer (tune.bufsize in HAProxy, in bytes). If your prompts generate huge responses, you’ll need to increase your tune.bufsize to capture the whole body.
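A minimal sketch of this step, under stated assumptions: the backend name, the 10-second wait, the 64 KB buffer size, and the variable names are all illustrative and not taken from the original listing. The JSON paths follow the usage object that OpenAI returns in its responses.

```haproxy
global
    # Assumption: 64 KB is large enough to hold the biggest expected response body.
    tune.bufsize 65536

backend openai
    mode http

    # Wait for the response body (up to 10 seconds) so that res.body is populated.
    http-response wait-for-body time 10s

    # Extract per-request token usage from the JSON response.
    http-response set-var(txn.prompt_tokens) res.body,json_query('$.usage.prompt_tokens','int')
    http-response set-var(txn.completion_tokens) res.body,json_query('$.usage.completion_tokens','int')
```

If the body is larger than the buffer, the usage object may fall outside the inspected bytes and the variables stay unset, which is why the buffer size matters here.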

Finally, we can add the prompt and completion tokens to our counters, but only if we adhere to the limit, since we'd otherwise never stop counting:

[Listing: blog20240711-13.cfg]

Conclusion

In this blog, we've introduced the concept of an AI gateway (quite similar to widely used API gateways) while implementing API key checking and token-based rate limiting. This element is unique to the AI gateway concept, where we care much more about rate limiting based on tokens as opposed to requests. This includes both incoming (prompt) and outgoing (completion) tokens.

In part 2 (coming soon), we'll implement an AI gateway in front of vLLM and/or Ray, while enforcing PII protection and more. Subscribe to our blog to make sure you stay updated!


Scalable AWS Load Balancing and Security With HAProxy Fusion

Amina Mujkanovic and Jakub Suchy — Tue, 09 Jul 2024 12:38:00 +0000

Amazon Web Services (AWS) is renowned for providing a comprehensive ecosystem that supports the computational and data storage needs essential for developing, deploying, and managing applications across different regions, ensuring that users experience fast and seamless service. However, as applications evolve, and especially when traffic increases significantly, management complexity increases, necessitating more intricate setups, additional attention to security, and the ability for the application to work across various regions. These changes may increase latency as well.

This complexity is particularly evident when Kubernetes, a system designed to automate the deployment, scaling, and management of containerized applications, is integrated into AWS environments. Despite the fact that its intended effect is to simplify networking, Kubernetes often adds layers of complexity to AWS load balancing, especially in multi-cluster, multi-region setups. Businesses find themselves facing the daunting task of managing application and load balancer sprawl, a challenge HAProxy is uniquely equipped to address.

HAProxy is sometimes referred to as a load balancer, but its capabilities go well beyond that. It serves as a solution for many of the challenges faced when using modern web infrastructure, in areas that include security and support for agile infrastructure.

This blog post describes how HAProxy supports scalability and security along with load balancing. It summarizes Jakub Suchy’s talk on "Scalable load balancing and security on AWS with HAProxy Fusion" from AWS re:Invent 2023, demonstrating in detail how HAProxy is revolutionizing AWS load balancing.

AWS load balancing in complex environments

Initially, deploying an application on AWS seems straightforward, but as businesses scale, complexity escalates.

Managing multiple applications across various AWS regions and legacy data centers introduces a new layer of complexity, necessitating advanced setups, enhanced security, and seamless cross-region operability.

The challenge of Kubernetes in AWS load balancing

Kubernetes is often introduced into an already complex environment to help with scaling. While serving this purpose, it can inadvertently add complexity to AWS load balancing across multi-cluster and multi-region setups.

Many users aim to have a Kubernetes cluster in every AWS region, so as to seamlessly move traffic between them during failures or upgrades. This leads to a profusion of load balancer instances. What starts as a manageable setup can quickly grow into thousands of load balancer, application, and database instances, significantly complicating the request path.

HAProxy solves this problem by streamlining the request path through applications and load balancers. Our solutions adapt to the complexities of Kubernetes environments, making it easier to handle the challenges of large-scale load balancing in AWS.

Simplification and scaling

HAProxy stands out for its dual capability to simultaneously scale network infrastructure and simplify it.

The principle we operate on is straightforward:

Nothing is harder to scale than complexity; so to scale effectively, you need to simplify first.

We achieve simplicity by consolidating the request path across multiple supportive capabilities, such as a WAF, rate limiting, and access control list (ACL) functionality, into a single layer led by load balancing capability. This approach significantly reduces complexity, merging various load balancing-related capabilities into one cohesive HAProxy Enterprise layer.

This simplification is complemented by HAProxy's scalability, which efficiently handles traffic spikes and traffic growth while providing features such as EC2 auto-scaling. This blend of simplification and scalability makes HAProxy a powerful tool for handling the demands of complex network infrastructures.

Multi-layered security with HAProxy Enterprise

Integrating multiple functionalities into a single layer doesn't just simplify operations, it also enhances security. In environments where web traffic demands are high, HAProxy Enterprise stands out for its robust security measures.

The simple act of consolidating functions enhances security. There are fewer vendors to deal with, fewer interfaces between different services, and less traffic on the network.

HAProxy Enterprise employs advanced techniques such as IP anycast and Route 53 DNS for smart traffic routing, while also offering strong DDoS protection and rate limiting crucial for maintaining service availability and performance under attack - all within the load balancer.

The built-in next-generation HAProxy Enterprise WAF provides ultra-low latency protection against application attacks, with exceptional balanced accuracy that virtually eliminates the security impact of false negatives and the noise of false positives. At the same time, HAProxy Enterprise’s fingerprinting and bot management capabilities help identify and mitigate sophisticated attacks. Rate limiting and IP-based access control further bolster security, managing traffic flow and ensuring that only authorized access to web resources is allowed.

This comprehensive suite of features ensures protection in demanding web traffic environments while contributing to reduced latency and faster processing speeds.

Flexible, customizable, and extensible

HAProxy is not a fixed solution; it can be fine-tuned to fit the demands of any network environment.

Use case: rate limiting across VPC regions using VPC ID identification

Many AWS users face challenges when managing web traffic across different virtual private clouds (VPCs) in various regions. VPCs with overlapping IP addresses make it difficult to identify the origin of incoming traffic. Traditional load balancing approaches, which typically rely on the source IP address for rate limiting, are inadequate.

HAProxy Enterprise provides a solution, using the VPC ID to identify each request. This is made possible by leveraging a feature from AWS's Network Load Balancer (NLB) that transmits the VPC ID alongside the traffic, utilizing the PROXY protocol. By rate limiting based on the VPC ID, HAProxy Enterprise achieves a more accurate and efficient traffic management system, overcoming the challenges posed by overlapping IP addresses. (The basics of rate limiting on AWS are described in our blog post.)

HAProxy's architecture allows for high customizability and flexibility, making it adaptable to a wide range of network environments. Unlike rigid, one-size-fits-all solutions, HAProxy can be precisely tailored to meet the specific demands and challenges of any setup, ensuring optimal performance and security in each scenario.

Centralized management with HAProxy Fusion Control Plane

The HAProxy Fusion Control Plane provides centralized management, monitoring, and automation for multiple clusters of HAProxy Enterprise instances in a distributed load balancer layer. It simplifies the task of overseeing numerous load balancers spread across different regions by enabling auto-scaling and facilitating automatic configuration inheritance.

This centralization reduces the burden of management, promotes consistency, and enhances the efficiency of traffic handling across the network.

Promoting infrastructure flexibility and observability

HAProxy strongly advocates the concept of infrastructure immutability, emphasizing the idea of replacing components rather than upgrading or repairing them. This approach, similar to modern practices observed in Kubernetes environments, ensures that the infrastructure remains agile and capable of quick adaptation to changing needs without being bogged down by legacy issues.

In addition to promoting flexibility, HAProxy plays a crucial role in enhancing the observability of complex systems. It makes traffic flows easier to track and provides insights into system performance in real time. This level of observability is essential for maintaining an efficient and responsive infrastructure, allowing for immediate identification and resolution of issues.

HAProxy is a game-changer in AWS load balancing

HAProxy goes beyond being just a load-balancing tool. It serves as a complete solution to the complex challenges faced by modern web infrastructures. By effectively managing traffic at scale, ensuring robust security, simplifying traffic flows, and fostering agile infrastructure, HAProxy Enterprise and HAProxy Fusion present a game-changing option for AWS users.

Watch Jakub’s talk on "Scalable load balancing and security on AWS with HAProxy Fusion" from AWS re:Invent 2023 to dive deeper into how HAProxy is revolutionizing AWS load balancing.


July 2024 – CVE-2024-6387: RCE in OpenSSH's server

HAProxy Technologies — Mon, 08 Jul 2024 11:00:00 +0000

The latest versions of our products fix a vulnerability related to OpenSSH’s server (sshd), which is used in the public/private cloud images of HAProxy Enterprise and the hardware/virtual appliances of HAProxy ALOHA.

A vulnerability in sshd’s SIGALRM handler permits unauthenticated remote code execution as root. This allows remote attackers to cause a denial of service (DoS), and possibly execute arbitrary code.

If you are using an affected product, you should upgrade to the fixed version as soon as possible. There is no workaround available.

Affected Versions & Remediation

HAProxy Technologies released new versions of HAProxy Enterprise and HAProxy ALOHA on Thursday, 4 July 2024. These releases patch the vulnerability described in CVE-2024-6387 (CVSSv3 score of 8.1).

Users of the affected products should upgrade to the fixed version as soon as possible by following the instructions below.

Update HAProxy Enterprise public/private cloud images using your Linux distribution’s regular package management operation, for example by using apt or yum
Update HAProxy ALOHA

Amazon AMIs and Azure VHDs are available.

Affected version	Fixed version
HAProxy ALOHA 16.0	16.0.2
HAProxy ALOHA 15.5	15.5.12
HAProxy ALOHA 14.5	14.5.23
HAProxy Enterprise public / private cloud images based on rhel9, ubuntu 22.04 and 24.04, debian 12	Any version published on or after 2024-07-04

Support

If you are an HAProxy Enterprise or HAProxy ALOHA customer and have questions about upgrading to the latest version, please get in touch with the HAProxy support team.


July 2024 – CVE-2024-24791: HTTP/1.1 response code mishandling in golang products

HAProxy Technologies — Mon, 08 Jul 2024 11:00:00 +0000

The latest versions of our products fix a vulnerability related to HTTP/1.1 response code mishandling in products written in golang. This affects multiple HAProxy Technologies products.

CVE-2024-24791 exposes a denial of service (DoS) vulnerability in Go's net/http client. The client mishandles the case where a server answers a request carrying an "Expect: 100-continue" header with a non-informational response (like a 200 OK). This leaves the connection unusable, causing subsequent requests to fail. Attackers can exploit this by sending "Expect: 100-continue" requests to overwhelm the proxy with unusable connections.

If you are using an affected product, you should upgrade to the fixed version as soon as possible. There is no workaround available.

Affected Versions & Remediation

HAProxy Technologies released new versions of HAProxy Fusion, HAProxy Enterprise Verify Crawler Module, HAProxy ALOHA, HAProxy Kubernetes Ingress Controller, HAProxy Enterprise Kubernetes Ingress Controller, Data Plane API, and Data Plane API Enterprise on Thursday, 4 July 2024. These releases patch the vulnerability described in CVE-2024-24791 (CVSSv3 score of 7.5).

Users of the affected products should upgrade to the fixed version as soon as possible by following the instructions below.

Affected version	Fixed version
HAProxy Fusion 1.2	1.2.32
HAProxy Fusion 1.1	1.1.15
HAProxy Fusion 1.0	1.0.22
HAProxy Fusion fusionctl	hapee-fusion-fusionctl-release-fusion-13.0 1.0.0-13.0
HAProxy Enterprise Verify Crawler Module	hapee-verify-crawler-release-extras-25.3 1.1-25.3
HAProxy ALOHA Management Package 16.0	16.0-1.0.4
HAProxy ALOHA Management Package 15.5	15.5-1.0.16
HAProxy ALOHA Management Package 14.5	14.5-1.0.20
HAProxy ALOHA Management Package 13.5	13.5-1.0.22
HAProxy Kubernetes Ingress Controller 3.0	3.0.1
HAProxy Kubernetes Ingress Controller 1.11	1.11.6
HAProxy Kubernetes Ingress Controller 1.10	1.10.16
HAProxy Enterprise Kubernetes Ingress Controller 1.11	1.11.6-ee1
HAProxy Enterprise Kubernetes Package 2.8	hapee-kubernetes-ingress-release-2.8r1-17.0 1.0.0-17.0
HAProxy Enterprise Kubernetes Package 2.6	hapee-kubernetes-ingress-release-2.6r1-20.0 1.0.0-20.0
HAProxy Enterprise Kubernetes Package 2.4	hapee-kubernetes-ingress-release-2.4r1-21.0 1.0.0-21.0
Data Plane API 2.9	2.9.5
Data Plane API 2.8	2.8.9
Data Plane API 2.7	2.7.13
Data Plane API Enterprise 2.9	hapee-dataplaneapi29-release-extras-179.0 2.9.4-179.0
Data Plane API Enterprise 2.8	hapee-dataplaneapi28-release-extras-187.0 2.8.8-187.0
Data Plane API Enterprise 2.6	hapee-dataplaneapi26-release-extras-161.0 2.6.5-161.0

Support

If you are a customer and have questions about upgrading to the latest version, please get in touch with the HAProxy support team.


Reviewing Every New Feature in HAProxy 3.0

Nick Ramirez, Ashley Morris and Alban Lebrun — Mon, 01 Jul 2024 09:42:00 +0000

HAProxy 3.0 maintains the strong momentum of our open-source load balancer into 2024 with improvements to simplicity, security, reliability, flexibility, and more. In this blog post, we'll dive into what’s new in version 3.0, providing examples and context. It’s a long list, so get comfortable and maybe bring a snack!

All these improvements (and more) will be incorporated into HAProxy Enterprise 3.0, releasing later this year.

Simplicity

New crt-store configuration section

The new crt-store configuration section provides a flexible way to store and consume SSL certificates. Replacing crt-list, crt-store separates certificate storage from certificate use in a frontend, and provides better visibility for certificate information by moving it from external files and placing it into the main HAProxy configuration. The crt-store section allows you to individually specify the locations of each certificate component, for example certificate files, key files, and OCSP response files. Aliases provide support for human-friendly names for referencing the certificates more easily on bind lines. The ocsp-update argument is now configured in a crt-store instead of a crt-list.

Consider the following example where a frontend references a crt-list for its certificate information, including OCSP response files. This is how you would use ocsp-update prior to crt-stores:

[Listing: blog20240626-01.cfg]

The contents of crt-list.txt would be:

[Listing: blog20240626-02.txt]

The file site1.pem would contain the public certificate, any intermediate certificates, and the private key. With ocsp-update on, HAProxy knows to look for an .ocsp file in the same directory as the certificates and keys with the same name (site1.ocsp).

Note that defining the certificate information in this way makes the information less visible, as it is not defined within the HAProxy configuration file.

To remedy this, use a crt-store section. To the HAProxy configuration file, add the following:

[Listing: blog20240626-03.cfg]

Note that in this example, the certificates, keys, and OCSP response files are split into their own files, and may be referenced individually. We are specifying a crt-base; this is the directory where we will place the files.

Now, you can reference these certificate elements in a frontend by their alias:

[Listing: blog20240626-04.cfg]

You must reference the certificates in a crt-store by using @<crt-store name>/<certificate or alias>, which in this case is @web/site1. Note here that we also gave our certificates for site1 an alias, and are referencing them by the alias (site1). If you do not provide a name for your crt-store, you can reference its certificates like so: @/site1, leaving out the <crt-store name>.

You can also use crt-store to specify keys separately from their certificates, as is the case in this example:

[Listing: blog20240626-05.cfg]

In this case, for site2, there are separate crt and key files. We specify a crt-base, the location of the certificates, and a key-base, the location of the keys. Once you define your certificates and keys in a crt-store you can then reference multiple certificates on the bind line:

[Listing: blog20240626-06.cfg]

You can also specify the crt-base and key-base in your global settings. Note that including crt-base or key-base in a crt-store takes precedence over the global settings. The same is true for using absolute paths when specifying your certificates and keys.
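To tie the pieces of this section together, here is a minimal sketch of a named crt-store and a frontend that consumes it. It is not one of the post's original listings; the directory paths, the frontend name, and the file names are assumptions, while the store name (web) and the alias (site1) follow the names used in the text above.

```haproxy
crt-store web
    # Assumed directories for certificates and private keys.
    crt-base /etc/haproxy/ssl/certs/
    key-base /etc/haproxy/ssl/keys/
    # site1: certificate, key, and OCSP response in separate files, with an alias.
    load crt "site1.crt" key "site1.key" ocsp "site1.ocsp" alias "site1"
    # site2: separate certificate and key files, no alias.
    load crt "site2.crt" key "site2.key"

frontend www
    mode http
    # Reference site1 by its alias and site2 by its certificate file name.
    bind :443 ssl crt "@web/site1" crt "@web/site2.crt"
```

Because site2 has no alias, it is referenced through the store by its certificate file name.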

Security

Protocol glitches

Some HTTP/2 requests are valid from a protocol perspective but pose problems anyway. For example, sending a single header as a large number of CONTINUATION frames could cause a denial of service. HAProxy now counts these so-called glitches and allows you to set a limit on them. You can also track them in a stick table to identify buggy applications or misbehaving clients.

OpenSSL security level

A new global keyword, ssl-security-level, allows you to set OpenSSL’s internal security level globally, that is, on every HAProxy SSL context. This enforces the appropriate checks and restrictions per the level specified. Parameters that are inconsistent with the level will be rejected, including cipher suite encryption algorithms, supported ECC curves, supported signature algorithms, DH parameter sizes, certificate key sizes, and certificate signature algorithms. For more information see: OpenSSL Set Security Level.

Specify a value between 0 and 5 to set the level, where 5 is the most strict:
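As a minimal sketch of the setting described above, the fragment below picks level 2 purely as an illustration. The right level for a given deployment depends on the clients and certificates in use.

```haproxy
global
    # Assumed value for illustration; valid levels range from 0 to 5.
    ssl-security-level 2
```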

[Listing: blog20240626-07.cfg]

Dependency on libsystemd removed

HAProxy’s security has been hardened by the removal of the dependency on the libsystemd library. This libsystemd library has many additional dependencies of its own, including the library at fault for the XZ Utils backdoor vulnerability. Removing the dependency on libsystemd means that HAProxy is not exposed to these undesired libraries.

Prevent HAProxy from accepting traffic on privileged ports

Two new global settings, harden.reject-privileged-ports.tcp for TCP connections and harden.reject-privileged-ports.quic for QUIC connections, enable HAProxy to ignore traffic when the client uses a privileged port (0 - 1023) as its source port. Clients using this range of ports are suspicious, and such behavior often indicates spoofing, such as to launch DNS/NTP amplification attacks. The benefit of using these settings is that during DNS/NTP amplification attacks, CPU is conserved, because HAProxy drops the packets from these privileged ports instead of parsing them.
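A minimal sketch that enables both settings in the global section is shown below. Turning on both at once is an illustrative choice; each can be enabled independently.

```haproxy
global
    # Reject TCP connections whose source port is in the privileged range.
    harden.reject-privileged-ports.tcp on
    # Apply the same rule to QUIC connections.
    harden.reject-privileged-ports.quic on
```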

Reliability

Persist stats

Reloading HAProxy will no longer reset the HAProxy Stats page, as long as you call the new Runtime API command dump stats-file first to save the current state to a file and then load that file with the stats-file global configuration directive. Upon reload, the new processes set their counters to the previous values (as they appear in the stats file). Only proxy counters are supported, which include values for frontends, backends, servers, and listeners. Ensure that you've set a GUID on each frontend, backend, listen, and server object by using the new guid keyword. Only objects to which you have assigned a GUID will have their stats persisted across the reload.

To enable this behavior, follow these five steps:

Step 1

Assign GUIDs to each frontend, backend, listen, and server object in your configuration. Here is a simple configuration with one frontend that routes to one backend with one server. The frontend, backend, and server will each have a globally unique identifier assigned:

[Listing: blog20240626-08.cfg]

Step 2

Reload HAProxy to apply your configuration changes:

[Listing: blog20240626-09.sh]

Step 3

Call dump stats-file to create the stats file, redirecting the output to your desired location:

[Listing: blog20240626-10.sh]

The data in the file will capture the current state of the objects.

Step 4

Add stats-file to your global settings with the path to your stats file location that will be reloaded:

[Listing: blog20240626-11.cfg]

Step 5

Reload HAProxy. It will reload the stats counters from the file.
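The configuration side of the steps above (Steps 1 and 4) can be sketched in a single file as follows. This is an illustration rather than the post's original listings: the GUID strings, the socket path, the stats file path, and the server address are all assumptions.

```haproxy
global
    # Runtime API socket, needed to call "dump stats-file" (Step 3).
    stats socket /var/run/haproxy.sock mode 660 level admin
    # Step 4: file from which counters are restored on reload (assumed path).
    stats-file /var/lib/haproxy/stats-file

frontend www
    # Step 1: a GUID on every object whose counters should persist.
    guid fe-www
    mode http
    bind :80
    default_backend webservers

backend webservers
    guid be-webservers
    mode http
    server web1 192.168.0.10:80 guid srv-web1
```

An object without a guid would simply start again from zero after the reload.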

You can also generate GUIDs for items on bind lines. The guid-prefix should be used in this case to specify a string prefix for automatically generated GUIDs. Each listener on the bind line will have a GUID automatically generated for it with the prefix included. Here's an example:

[Listing: blog20240626-12.cfg]

HTTP/2: Limit the number of streams per connection

You can limit the number of total streams processed per incoming connection for HTTP/2 using the global keyword tune.h2.fe-max-total-streams. Once this limit is reached, HAProxy will send a graceful GOAWAY frame informing the client that it will close the connection after all pending streams have been closed. Usually, this prompts clients to reestablish their connections. This is helpful in situations where there is an imbalance in the load balancing due to clients maintaining long-lived connections.
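A minimal sketch of this tuning is shown below. The value of 1000 is an assumption chosen for illustration, not a recommendation.

```haproxy
global
    # After 1000 streams on one HTTP/2 connection, HAProxy sends a graceful
    # GOAWAY so that the client reconnects and is balanced again.
    tune.h2.fe-max-total-streams 1000
```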

Diagnostic mode

Diagnostic mode (-dD) will now report on likely problematic ACL pattern values that look like known ACL/sample fetch keywords in addition to its other warnings about suspicious configuration statements. It does not prevent startup. This is helpful for troubleshooting problematic configurations.

Consistent hash server mapping

When load balancing using a hash-based algorithm (balance hash), HAProxy must keep track of which server is which. Instead of using numeric IDs to compute hash keys for the servers in a backend, the hash-key directive now supports using the servers’ addresses and ports to compute the hash keys. This is useful in cases where multiple HAProxy processes are balancing traffic to the same set of servers, as each independent HAProxy process will calculate the same hash key and therefore agree on routing decisions, even if its list of servers is in a different order.

The new hash-key directive allows you to specify how the node keys, that is, the hash for each server, are calculated. This applies to node keys where you have set hash-type to consistent. There are three options for hash-key:

id: the server’s numeric id if set, or its position in the server list
addr: the server’s IP address
addr-port: the server’s IP address and port

If you were to use hash-key id, and have no IDs explicitly set for your servers, the hashes could be inconsistent across load balancers if your server list were in a different order on each load balancer. Using addr or addr-port solves this problem.

Consider the following backend:

[Listing: blog20240626-13.cfg]

This backend will use the hash load balancing algorithm, calculating the hash using the sample expression pathq (the request’s URL path with the query-string). Since we specified the hash-type as consistent, and the servers’ hash-key as addr (the IP address of the server), the servers’ IP addresses are used in the hash key computation. As such, if we had multiple load balancers with the servers listed in different orders, the routing would be consistent across the load balancers.

If multiple instances of HAProxy are running on the same host, for example in Docker containers, using both the IP address and the port (hash-key addr-port) is appropriate for creating the hash key.
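For the same-host case just described, a sketch using addr-port might look like the fragment below. The backend name, server names, address, and ports are assumptions; the two servers share one IP address and differ only by port, which is why the port has to be part of the node key.

```haproxy
backend app
    mode http
    # Hash on the URL path plus query string.
    balance hash pathq
    hash-type consistent
    # Both servers run on the same host, so key each node by address and port.
    server app1 192.168.0.10:8081 hash-key addr-port
    server app2 192.168.0.10:8082 hash-key addr-port
```

With hash-key addr alone, these two servers would share the same address as key input, so the port is what tells them apart.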

Memory limit

To limit the total allocatable memory to a specified number of megabytes across all processes, use -m <limit>. This is useful in situations where resources are constrained, but note that specifying a limit this way may cause connection refusals and slowdowns depending on memory usage requirements. The limiting mechanism now uses RLIMIT_DATA instead of RLIMIT_AS, which prevents an issue where the processes were reaching out-of-memory conditions below the configured limit. In master-worker mode, the memory limit is applied separately to the master and its forked worker process, as memory is not shared between processes.

Unix sockets and namespaces

Version 3.0 supports the namespace keyword for UNIX sockets specified on bind and server lines. If a permissions issue arises, HAProxy will log a permissions error instead of “failed to create socket”, since the capability cap_sys_admin is required for using namespaces for sockets.

For example, to specify a UNIX socket with a namespace, use the namespace keyword on your server line:

[Listing: blog20240626-14.cfg]

gRPC

With the gRPC protocol, the client can cancel communication with the server, which should be conveyed to the server so that it can clean up resources and perhaps invoke cancellations of its own to upstream gRPC services. Cancellations are useful for a few reasons. Often, gRPC clients configure deadlines so that a call will be canceled if it runs too long. Or a client might invoke a cancellation if it finishes early and no longer needs the stream. Read more about cancellation in the gRPC documentation.

Prior to HAProxy 3.0, cancellations, which are represented as RST_STREAM frames in HTTP/2 or as STOP_SENDING frames in HTTP/3, were not being properly relayed to the server. This has been fixed. Furthermore, HAProxy 3.0 adds new fetch methods that indicate whether the stream was canceled (aka aborted) and the error code. Below, we include them in a custom log format:

[Listing: blog20240626-15.cfg]

Here's the output when the client cancels:

[Listing: blog20240626-16.txt]

New global keyword: expose-deprecated-directives

If you want to use deprecated directives, you must also use the expose-deprecated-directives global keyword, which will silence warnings. This global keyword applies to deprecated directives that do not have alternative solutions.
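In configuration terms, this amounts to a single line in the global section, sketched below. The keyword takes no arguments.

```haproxy
global
    # Allow deprecated directives that have no alternative to be used.
    expose-deprecated-directives
```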

Emergency memory allocation

The mechanism for emergency memory allocation in low memory conditions has been improved for HTTP/1 and applets. Forward progress in processing is guaranteed as tasks are now queued according to their criticality and which missing buffer they require.

Flexibility

Write HAProxy logs as JSON or CBOR

You can now format log lines as JSON and CBOR. When configuring a custom log format, you will indicate which to use, and then in parentheses set the key for each field.
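As a rough sketch of the syntax just described, the fragment below starts the format string with %{+json}o and gives each field a key in parentheses. The key names, the choice of fields, and the log target are assumptions made for illustration.

```haproxy
frontend www
    mode http
    bind :80
    log stdout format raw local0
    # %{+json}o switches the output to JSON; each %(key) names the field that follows.
    log-format "%{+json}o %(client_ip)ci %(client_port)cp %(frontend)ft %(backend)b %(server)s %(status)ST %(bytes_read)B"
```

Replacing %{+json}o with %{+cbor}o in the same line selects CBOR output instead.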

Here is an example for JSON:

[Listing: blog20240626-17.cfg]

This generates messages in the log file that look like this:

[Listing: blog20240626-18.json]

If you switch that line to start with %{+cbor} instead of %{+json}, then the generated messages will look like this:

[Listing: blog20240626-19.cbor]

Virtual and optional ACL and Map files

In containerized environments, sometimes you don't want to deal with attaching a volume and mapping that volume to the container's filesystem, especially when it comes to storing data that is dynamic. In other words, wouldn't it be nice to remove the need to attach a volume? With HAProxy 3.0, you can now work with virtual ACL and map files. By prefixing your ACL and Map files with virtual@, HAProxy won't search the filesystem for the files but rather creates in-memory representations of them only.

The configuration below sets the stage for adding IP addresses to a denylist, where that denylist is virtual:

[Listing: blog20240626-20.cfg]

Then to add an address to the denylist, use the Runtime API. Below, we deny the IP address 172.20.0.1:

[Listing: blog20240626-21.sh]

You can also prefix the filename with opt@, which marks the file as optional. In that case, HAProxy will check for the file on the filesystem, but if it doesn't find it, will assume it is virtual. That's useful for later saving the contents to a file so that they persist across reloading HAProxy.

Report the key that matched in a map file

A request matched a row in a map file, but which row? This version of HAProxy adds several new map_*_key converters that return the key that was matched, rather than the associated value, making it easier to view the reason why the load balancer rerouted a request or took some other action.

In the example below, we use a Map file for path-based routing, where the request's URL path determines which backend to send a request to. By using a converter that ends in _key, in this case map_beg_key, which will match the beginning of the key in the file and then return the key, we can record which key in the Map file matches the request:

[Listing: blog20240626-22.cfg]

Let's assume that paths.map looks like this:

[Listing: blog20240626-23.txt]

Then when a user requests a URL that begins with /api, they'll be sent to the backend named apiservers. When they request a URL that begins with /cart, they'll be sent to the backend named cartservers. If a request matches neither, it will record a dash in the logs. Our log shows this:

[Listing: blog20240626-24.txt]

Explicitly set default TLS certificates

HAProxy can proxy traffic for multiple websites through the same frontend. To choose the right TLS certificate, it will compare the TLS Server Name Indication (SNI) value of the incoming connection with the certificates it finds in the certificates directory. If no certificate in your directory matches, it will resort to using a default one, which is the certificate that was loaded first (was first in the directory). But what if you wanted to set multiple defaults? For example, suppose you wanted to default to an ECDSA certificate, if supported, otherwise default to an RSA certificate? Now you have multiple ways to do this:

Ensure that the first file in the directory is a multi-cert bundle.
Use the new default-crt argument on a bind line.
If using crt-list, indicate a default certificate by marking it with an asterisk.

To demonstrate setting one or more default-crt arguments on the frontend's bind line, below we set crt to a directory so that HAProxy will select the correct certificate from that directory based on SNI. But if there is no match, it will instead use one of the default files—either ECDSA or RSA:
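A sketch of such a bind line, under stated assumptions, is shown below. The directory and the two default certificate file names are hypothetical; only the crt and default-crt arguments come from the text.

```haproxy
frontend www
    mode http
    # Pick a certificate from /etc/haproxy/certs/ by SNI; when nothing matches,
    # fall back to the ECDSA default if the client supports it, else the RSA one.
    bind :443 ssl default-crt /etc/haproxy/certs/default.pem.ecdsa default-crt /etc/haproxy/certs/default.pem.rsa crt /etc/haproxy/certs/
```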

[Listing: blog20240626-25.cfg]

Track specific HTTP errors

Until now, you could capture in a stick table the count and rate of client HTTP errors (4xx status codes) and server HTTP errors (5xx status codes), but you could not control specifically which status codes were included. This version adds the global directives http-err-codes and http-fail-codes that let you set the status codes you care about, allowing you to ignore those that don't matter to you. This works for responses from backend servers and for responses from HAProxy.

Tracking client HTTP errors can be useful for discovering misconfigured client applications, such as those that repeatedly use the wrong credentials or that make an unusually high number of requests. Below, we configure a stick table to track only 403 Forbidden errors, but you can also configure HAProxy to track a range of status codes by separating them with commas or indicating a range of codes with a dash:

[Listing: blog20240626-26.cfg]

Then we use the Runtime API command show table to see the number of 403 errors from each client. Here, the client that has the source IP address 172.19.0.10 has had 20 errors:

[Listing: blog20240626-27.sh]

Error counts show you client-side issues, such as requesting a missing page or a forbidden page. On the other hand, you can set the global directive http-fail-codes to track server HTTP errors, such as 500 Internal Server Error and 503 Service Unavailable. Use it with a stick table that tracks http_fail_rate or http_fail_cnt to track server-side failure rates and counts.
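To make the idea of tracking only selected status codes concrete, here is a minimal Python sketch. It is not HAProxy's parser or its stick table implementation; the function names and the sample data are assumptions made for illustration. It expands a list that uses commas and dashes, as described above, and counts matching responses per client.

```python
from collections import Counter


def parse_status_codes(spec: str) -> set[int]:
    """Expand a spec such as "403" or "400-404,429" into a set of status codes."""
    codes: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(value) for value in part.split("-", 1))
            if low > high:
                raise ValueError(f"invalid range: {part}")
            codes.update(range(low, high + 1))
        else:
            codes.add(int(part))
    return codes


def count_errors(responses, spec: str) -> Counter:
    """Count, per client IP, the responses whose status is in the tracked set."""
    tracked = parse_status_codes(spec)
    return Counter(ip for ip, status in responses if status in tracked)


# Hypothetical (client IP, status) pairs standing in for real traffic.
responses = [
    ("172.19.0.10", 403),
    ("172.19.0.10", 404),
    ("172.19.0.11", 403),
    ("172.19.0.10", 403),
]

only_403 = count_errors(responses, "403")
client_errors = count_errors(responses, "400-404,429")
print(only_403)
print(client_errors)
```

Narrowing the tracked set changes what a counter means: with only 403 tracked, the count reflects forbidden requests rather than every client-side mistake.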

Stick table pointers

Previously, you could use the Runtime API commands clear table [table] [key] and set table [table] [key] to remove or update a record in a stick table based on its key. You can now pass the record's unique ID (its pointer) to remove it or update it. You get the pointer from the show table [table] command.

In the example below, the pointer is 0x7f7de4bb50d8:

[Listing: blog20240626-28.sh]

To update the record, use its pointer. Below we set the HTTP request count to zero:

[Listing: blog20240626-29.sh]

Similarly, to remove the record, use its pointer:

[Listing: blog20240626-30.sh]

Linux capabilities

Version 2.9 added the ability to preserve previously set Linux capabilities after the HAProxy process starts up. Linux capabilities are permissions granted to a process that it would not otherwise have when running as a non-root user. These permissions become relevant since HAProxy does not run as a root user after startup.

In version 3.0, HAProxy is smarter about checking capabilities and will no longer emit error messages regarding the need for root permissions when running in transparent proxying mode or when binding to a port below 1024 (as is the case when using QUIC), so long as cap_net_admin is set. Additionally, file-system capabilities can now be set on the binary and if you start HAProxy as a root user, then adding setcap in the configuration is enough for adding those capabilities to the process. HAProxy will move the capabilities set on the binary (Permitted set) to its Effective set as long as the capabilities are also present in the setcap keyword list.

As a refresher, you can indicate which Linux capabilities to preserve after startup by using the setcap global keyword, specifying capabilities with a comma between each one:

[Listing: blog20240626-31.cfg]

Note that due to security restrictions put in place by modules such as SELinux or Seccomp, HAProxy may be unable to set the required capabilities on the process. In this case, you must also set the capabilities on the HAProxy binary from the command line:

[Listing: blog20240626-32.sh]

Available capabilities that you can preserve using setcap include:

cap_net_admin: Used for transparent proxying mode and for setting marks or traffic priority on packets.
cap_net_raw (subset of cap_net_admin): Used for setting a source IP address that does not belong to the system itself, as is the case with transparent proxying mode.
cap_net_bind_service: Used for binding a socket to a privileged port (lower than 1024). This is required when using QUIC and binding to a privileged port.
cap_sys_admin: Used for creating a socket in a specific network namespace.

Set fwmark on packets to clients and servers

With HAProxy, you can set the fwmark on an IP packet, which classifies it so that, for example, it can use a specific routing table. HAProxy 3.0 now supports setting an fwmark on connections to backend servers as well as to clients connected on the frontend. Use the set-fc-mark and set-bc-mark actions. These replace the set-mark action, which had applied only to frontends.

To test this, first, give HAProxy the cap_net_admin capability, which is required for setting marks, by adding the setcap directive to the global section of your configuration:

[Listing: blog20240626-33.cfg]

We'd like to mark packets coming from the servers backend and ensure that they always go out through a specific network interface. Let's set an fwmark with a value of 2 (an arbitrary number) for the current connection. We hardcode the value, but you can also use an expression of fetch methods and converters to set it:

[Listing: blog20240626-34.cfg]

Now that we're marking packets, we just need to tell the network stack to route those packets through the desired interface, which is eth2 in this case:

[Listing: blog20240626-35.sh]

To verify that the mark was set and that traffic is using the eth2 interface, you can use iptables to log the traffic:

[Listing: blog20240626-36.sh]

Watch the kernel log. It shows OUT=eth2 and MARK=0x2:

[Listing: blog20240626-37.sh]

Set traffic priority

HAProxy can modify the header of an IP packet to include the Differentiated Services (DS) field. This field classifies the packet so that the network stack can prioritize it higher or lower in relation to other traffic on the network. New in this version of HAProxy, you can set this field on connections to backend servers in addition to frontend connections to clients. To set the value, use the set-fc-tos and set-bc-tos actions (referring to the old Type of Service (TOS) header field, which has been superseded by DS).

First, give the HAProxy binary the cap_net_admin capability, which is required for setting network priority:

[Listing: blog20240626-38.sh]

Then in the HAProxy configuration file, add the setcap directive to preserve that capability after HAProxy drops root privileges:

[Listing: blog20240626-39.cfg]

We would like to prioritize video traffic on the backend, so we set the set-bc-tos directive to 26. You can learn more about DSCP in RFC4594, and find common values on the IANA DSCP webpage. Although we're hardcoding the value, you can also use an expression of fetch methods and converters to set it:

[Listing: blog20240626-40.cfg]

New sample fetches

New fetch methods introduced in this version expose data that you can use in ACL expressions. They include fetches that return the number of open HTTP streams for a backend or frontend, the size of the backend queue, the allowed number of streams, and a value that indicates whether a connection was redispatched because a server became unreachable.

bc_be_queue – The number of streams de-queued while waiting for a connection slot on the target backend
bc_glitches – The number of protocol glitches counted on the backend connection
bc_nb_streams – The number of streams opened on the backend connection
bc_srv_queue – The number of streams de-queued while waiting for a connection slot on the target server
bc_settings_streams_limit – The maximum number of streams allowed on the backend connection
fc_glitches – The number of protocol glitches counted on the frontend connection
fc_nb_streams – The number of streams opened on the frontend connection
fc_settings_streams_limit – The maximum number of streams allowed on the frontend connection
txn.redispatched – True if the connection has experienced redispatch

Weights in log backends

Log backends were introduced in version 2.9. They allow you to set mode log in a backend to load balance the Syslog protocol. You can connect to backend Syslog servers over TCP or UDP.

In version 3.0, you can now set weights on server lines in your mode log backends. The example below demonstrates a log backend that uses weights:

[Listing: blog20240626-41.cfg]

Here, HAProxy listens for incoming log messages on port 514 over both TCP and UDP. As a simple setup, you can run an NGINX web server as a Docker container, setting up the Docker daemon to forward the container's logs to HAProxy by saving the following as /etc/docker/daemon.json:

[Listing: blog20240626-42.json]

Then run the NGINX container and watch the logs come through on your Syslog servers as you make web requests:

[Listing: blog20240626-43.sh]

Because we've set weights on the servers in HAProxy, the servers will receive different amounts of log traffic.
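As a rough illustration of how weights skew the share of messages each server receives, the Python sketch below uses a naive weighted rotation. It is not HAProxy's scheduler, which interleaves servers differently, and the server names and weights are hypothetical.

```python
from collections import Counter
from itertools import cycle, islice


def weighted_cycle(weights: dict[str, int]):
    """Yield server names in a repeating order where each name appears `weight` times."""
    slots = [name for name, weight in weights.items() for _ in range(weight)]
    if not slots:
        raise ValueError("at least one server needs a positive weight")
    return cycle(slots)


# Hypothetical Syslog servers: log1 should receive three times the traffic of log2.
weights = {"log1": 3, "log2": 1}

# Simulate 400 log messages and count where each one would go.
picks = Counter(islice(weighted_cycle(weights), 400))
print(picks)
```

With weights of 3 and 1, the first server ends up with three quarters of the messages, which is the kind of uneven split you should expect to see on the Syslog servers.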

Support for UUIDv7 identifiers

You can now generate universally unique identifiers that use the UUIDv7 format. UUIDs of this type factor in the current UNIX timestamp and are therefore time sortable, which tells you when a UUID was generated in relation to other UUIDs. In the example below, we set the unique ID format to version 7 and use the unique-id fetch method to get a new UUID to include in the logs:

[Listing: blog20240626-44.cfg]

Log messages will look like this, where the first field is the UUID:

[Listing: blog20240626-45.json]

HTTP forward proxy for OCSP updates

HAProxy 2.8 introduced a new and simpler way to configure OCSP stapling. HAProxy periodically contacts your SSL certificate issuer's OCSP server to get the revocation status of your SSL certificate. In version 3.0, if you're operating in an air-gapped environment where the HAProxy server does not have direct access to the Internet, and therefore can't connect to the OCSP server, you can set an HTTP forward proxy to reach the Internet. Add the ocsp-update.httpproxy global directive to indicate the proxy's address:

[Listing: blog20240626-46.cfg]

Then the HTTP forward proxy can relay the OCSP request to the OCSP server.

Reverse HTTP

Reverse HTTP, which is an experimental feature added in HAProxy 2.9, allows HAProxy load balancers to self-register with an edge HAProxy load balancer and to then fill in as backend servers. In version 2.9, the mechanism for matching client requests with the correct backend server / HAProxy instance was via SNI. A new directive named pool-conn-name provides more flexibility, allowing you to set the name to match with an expression of fetch methods and converters.

Observability

Prometheus

When configuring the Prometheus exporter, you can now include a URL parameter named extra-counters, which enables additional counters that provide information related to the HTTP protocol, QUIC, and SSL. Set this in your Prometheus server's prometheus.yml file, replacing haproxy with your load balancer's IP address:

[Listing: blog20240626-47.txt]

Decrypt TLS 1.3 packet captures to backend servers

When diagnosing network issues, you will often need to analyze TLS-encrypted traffic to see the underlying application-layer protocol messages. For example, when using Wireshark, you can import a packet capture file that contains the traffic you would like to analyze, but then you need a way to decipher the captured packets. The most common way to do this is to import a key log file into Wireshark, which contains the secrets needed to decipher a TLS session.

While prior versions of HAProxy supported producing a key log file for deciphering traffic between the client and HAProxy, HAProxy 3.0 adds the ability to produce a key log file for deciphering TLS traffic between HAProxy and backend servers, specifically when using TLS 1.3.

Follow these seven steps:

Step 1

In your HAProxy configuration, set tune.ssl.keylog to on in the global section. This activates the retrieval of the TLS keys you will use for decryption in Wireshark:

[Listing: blog20240626-48.cfg]

Step 2

Force HAProxy and the backend servers to use TLS 1.3 by adding the ssl-min-ver argument to the servers:

[Listing: blog20240626-49.cfg]

Step 3

Define a custom log format that writes TLS session secrets to the access log:

[Listing: blog20240626-50.cfg]

Step 4

After HAProxy connects to a backend server, the access log will contain the keys for that TLS session. The access log will contain lines like this:

[Listing: blog20240626-51.txt]

Step 5

Copy these lines to a text file. Then import the file into Wireshark via Edit > Preferences > Protocols > TLS > (Pre)-Master-Secret log filename.
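If the access log mixes key material with ordinary log lines, copying by hand gets tedious. The Python sketch below pulls out lines that follow the NSS key log layout that Wireshark reads (a label, the 32-byte client random in hex, then the secret in hex). The log prefix and the hex values in the sample are made up, and the function name is hypothetical; adjust the pattern to whatever your own log format emits.

```python
import re

# TLS 1.3 labels used in the NSS key log format.
LABELS = (
    "CLIENT_EARLY_TRAFFIC_SECRET",
    "CLIENT_HANDSHAKE_TRAFFIC_SECRET",
    "SERVER_HANDSHAKE_TRAFFIC_SECRET",
    "CLIENT_TRAFFIC_SECRET_0",
    "SERVER_TRAFFIC_SECRET_0",
    "EXPORTER_SECRET",
)
KEYLOG_RE = re.compile(
    r"\b(" + "|".join(LABELS) + r") ([0-9a-fA-F]{64}) ([0-9a-fA-F]+)\b"
)


def extract_keylog_lines(log_text: str) -> list[str]:
    """Return unique '<label> <client_random> <secret>' lines found in log text."""
    found: list[str] = []
    for match in KEYLOG_RE.finditer(log_text):
        line = " ".join(match.groups())
        if line not in found:
            found.append(line)
    return found


# Made-up sample: a fake syslog prefix and fake hex values, for illustration only.
random_hex = "ab" * 32
sample_log = "\n".join([
    "haproxy[42]: CLIENT_HANDSHAKE_TRAFFIC_SECRET " + random_hex + " " + "cd" * 32,
    "haproxy[42]: unrelated access log line",
    "haproxy[42]: SERVER_TRAFFIC_SECRET_0 " + random_hex + " " + "ef" * 48,
])

keylog = extract_keylog_lines(sample_log)
print("\n".join(keylog))
```

Saving the printed lines to a text file gives you the file to select in the Wireshark preference mentioned above.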

Step 6

At the same time, capture traffic between HAProxy and the backend servers. For example, you can run tcpdump on the HAProxy server to get a PCAP file:

[Listing: blog20240626-52.sh]

Step 7

Open this PCAP file in Wireshark to see the deciphered traffic.

Runtime API

show quic verbosity

The Runtime API's show quic command gained finer-grained levels of verbosity and the ability to filter the output to a specific connection. Previously, the verbosity levels you could specify were oneline or full. Now you can also specify a comma-delimited list of values that determine the output. Values include tp, sock, pktns, cc, and mux. Also, in this version you can specify the connection's hexadecimal address to see information for just that connection.

Here are examples that show output for the new verbosity levels:

[Listing: blog20240626-53.sh]

Cookies for dynamic servers

Static cookies for session persistence are now supported for dynamically added servers. Dynamic servers refer to servers that do not have an explicit entry within your HAProxy configuration file. They are dynamic in the sense that you can add them programmatically using Runtime API commands. Dynamic servers are valuable in cases where you may have many servers that scale with traffic: when traffic loads are high, you add more servers, and when traffic loads diminish, you remove servers.

Previous versions of HAProxy did not support adding these dynamic servers and also using static cookies with those servers, but as of version 3.0, you can now use the add server command to add the server and specify its static cookie using just one command. Note that when adding a dynamic server, you must choose a load balancing algorithm for your backend that is dynamic (roundrobin, leastconn, or random).

To add a dynamic server to your backend with a static cookie, issue the add server command, specifying your cookie:

[Listing: blog20240626-54.sh]

Here's the output:

[Listing: blog20240626-55.txt]

You can also enable auto-generated names for session persistence cookies. For more information, see our guide for setting a dynamic cookie key. In that case, if you set the cookie argument on your add server command, the static cookie you specify will take precedence over the backend’s setting for dynamic cookies.

As a reminder, servers added via the Runtime API are not persisted after a reload. To ensure that servers you add via the Runtime API persist after a reload, be sure to also add them to your HAProxy configuration file (thus making them static servers).

Del server

Removing a dynamic server with the del server command is faster now that the command can close idle connections, where previously it would wait for the connections to close by themselves. This improvement is made possible by changes to the removal mechanism which allows forceful closure of idle connections.

Wait command

The new wait command for the Runtime API waits for a specified time before running the next command. You could use this to collect metrics at a fixed interval. It is also useful when you need to wait until a server becomes removable (the server has been drained of connections) before running additional commands, such as del server.

The syntax for the command is: wait { -h | <delay> } [<condition> [<args>...]]

If you do not provide any conditions, the command simply waits for the requested delay (in milliseconds by default) before it continues processing.

With this release, the only supported condition is srv-removable which will wait until the specified server is removable. When using socat, be sure to extend socat’s timeout to account for the wait time.
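The same timeout concern applies to any client, not just socat. As a minimal sketch, the Python below builds a single command line that drains and deletes a server, then sends it over the Runtime API's UNIX socket with a socket timeout longer than the wait delay. The function names, the five-second margin, and the socket path in the usage comment are assumptions; the delay is passed in milliseconds, the default unit described above.

```python
import socket


def build_drain_and_delete(backend: str, server: str, wait_ms: int) -> str:
    """Build one command line that disables, drains, waits for, and deletes a server."""
    target = f"{backend}/{server}"
    return "; ".join([
        f"disable server {target}",
        f"shutdown sessions server {target}",
        f"wait {wait_ms} srv-removable {target}",
        f"del server {target}",
    ])


def send_command(socket_path: str, command: str, wait_ms: int, margin_s: float = 5.0) -> str:
    """Send one command line to a Runtime API UNIX socket and return the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        # The client must outlast the wait delay, so add a margin on top of it.
        sock.settimeout(wait_ms / 1000 + margin_s)
        sock.connect(socket_path)
        sock.sendall(command.encode() + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # HAProxy closed the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode()


command = build_drain_and_delete("app", "app1", 2000)
print(command)

# Usage against a running HAProxy (the socket path is an assumption):
# print(send_command("/var/run/haproxy.sock", command, wait_ms=2000))
```

Deriving the timeout from the wait delay keeps the two values from drifting apart when you change one of them.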

The following example calls show activity, waits 10 seconds, then calls show activity again. Note that socat’s timeout value has been increased to 20 seconds:

[Listing: blog20240626-56.sh]

Here's the output:

[Listing: blog20240626-57.txt]

This example disables the server named app/app1, calls shutdown sessions for the server, waits for the server to be removable (using the condition srv-removable), and then once removable, deletes the server:

[Listing: blog20240626-58.sh]

Finally, here's the output:

[Listing: blog20240626-59.txt]

Performance

Fast forwarding

Zero-copy forwarding was introduced in version 2.9 for TCP, HTTP/1, HTTP/2, and HTTP/3. As of version 3.0, applets, such as the cache, can also take advantage of the fast forwarding mechanism, which avoids queuing more data when the mux buffer is full. This results in significantly less memory usage and higher performance. This behavior can be disabled by using tune.applet.zero-copy-forwarding for applets only, or tune.disable.zero-copy-forwarding globally.

With regard to QUIC, simplification of the internal send API resulted in the removal of one buffer copy. Fast forwarding now takes flow control into account, which reduces the number of thread wakeups and optimizes packet filling.

The HTTP/1 mux now also supports zero-copy forwarding for chunks of unknown size, for example, chunks whose size may be larger than the buffer.

Ebtree update

Performance improved for ebtree on non-x86 machines. This results in approximately 3% faster task switching and approximately 2% faster string lookups.

Server name lookup

Configuration parsing time will see an improvement thanks to a change in server name lookups. The lookups now use a tree, which improves lookup time, whereas before the lookup was a linear operation.

QUIC

QUIC users will see a performance increase when using two new global settings:

tune.quic.reorder-ratio
- By adjusting tune.quic.reorder-ratio, you can change how quickly HAProxy detects packet losses. This setting applies to outgoing packets. When HAProxy receives an acknowledgement (ACK) for a packet it sent to the destination and that ACK arrived before other expected ACKs, or in other words it arrived out of sequence, it could indicate that packets never reached the destination. If it happens frequently, it indicates a poor network condition. By lowering the ratio, you're lowering the number of packets that can be missing ACKs before HAProxy takes action. That action is to reduce the packet sending window, which forces HAProxy to slow down its rate of transfer, at the cost of slower throughput. The default value is 50%, which means that the latest ACKed packet was halfway up the range of sent packets awaiting acknowledgements, with packets preceding it not yet ACKed.
tune.quic.cc-hystart
- Use this setting to enable the HyStart++ (RFC 9406) algorithm, which provides an alternative to the slow start phase of the congestion control algorithm. This algorithm may show better recovery patterns regarding packet loss.

Additionally, the send path for QUIC was improved by cleanup on some of the low level QUIC sending functions. This includes exclusive use of sendmsg() (a system call for sending messages over sockets), optimizations avoiding unnecessary function calls, and avoiding copies where possible.

Traces

An improvement to traces will enable their use on servers with moderate to high levels of traffic without risk of server impact.

The improvement to traces was made possible by the implementation of near lockless rings. Previously, the locking mechanism limited the possibility of using traces in a production environment. Now that the rings are nearly lockless, allowing for parallel writes per group of threads, performance with traces enabled has been increased up to 20x.

A new global setting tune.ring.queues that sets the number of write queues in front of ring buffers can be used for debugging, as changing the value may reveal important behavior during a debugging session. This should only be changed if you are instructed to do so for troubleshooting.

Stick tables

Stick tables have received a performance boost due to a change in the locking mechanism. Previously, when the number of updates was high, stick tables could cause CPU usage to rise, due to the overhead associated with locking. Using peers amplified this issue. Stick tables are now sharded over multiple, smaller tables, each having their own lock, thus reducing lock contention. Also, the interlocking between peers and tables has been significantly reduced.

This means that on systems with many threads, stick table performance improves greatly. On a system with 80 threads, we measured performance gains of approximately 6x. As for systems with low thread counts, performance could be improved by as much as 2x when using peers.
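The general technique of sharding a table to reduce lock contention can be sketched in a few lines of Python. This is an illustration of the idea only, not HAProxy's implementation: the class name, the shard count, and the use of CRC32 to choose a shard are all assumptions.

```python
import threading
import zlib


class ShardedCounterTable:
    """A counter table split into shards, each protected by its own lock."""

    def __init__(self, shards: int = 8):
        self._shards = [({}, threading.Lock()) for _ in range(shards)]

    def _shard_for(self, key: str):
        # A stable hash of the key decides which shard (and which lock) is used.
        return self._shards[zlib.crc32(key.encode()) % len(self._shards)]

    def increment(self, key: str, amount: int = 1) -> int:
        table, lock = self._shard_for(key)
        with lock:  # only writers to the same shard contend here
            table[key] = table.get(key, 0) + amount
            return table[key]

    def get(self, key: str) -> int:
        table, lock = self._shard_for(key)
        with lock:
            return table.get(key, 0)


table = ShardedCounterTable(shards=4)


def worker(offset: int) -> None:
    for i in range(1000):
        table.increment(f"10.0.0.{(offset + i) % 16}")


threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total = sum(table.get(f"10.0.0.{i}") for i in range(16))
print(total)
```

Because two threads only block each other when their keys land in the same shard, adding shards spreads the contention that a single table-wide lock would concentrate.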

Lua

Single-threaded Lua scripts using lua-load will see a performance improvement. This improvement is the result of a change to the loading mechanism, where the maximum number of instructions is now divided by the number of threads. This makes it so that waiting threads have a shorter wait time and share the time slot more evenly. Safeguards are in place to prevent thread contention for threads waiting for the global Lua lock.

Use tune.lua.forced-yield to tune the thread yielding behavior. For scripts that use lua-load, the optimal (and default) value was found to be the maximum of either 500 instructions or 10000 instructions divided by the number of threads. As for scripts loaded using lua-load-per-thread, in cases where more responsiveness is required, the value can be lowered from the default of 10000 instructions. In cases where the results of the Lua scripts are mandatory for processing the data, the value can be increased, but with caution, as an increase could cause thread contention.
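The default described above is simple arithmetic, shown here as a small Python sketch. The function name and the use of integer division are assumptions; the constants 500 and 10000 come from the text.

```python
def default_forced_yield(threads: int, uses_lua_load: bool) -> int:
    """Return the default instruction count between forced yields, per the rule above."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if not uses_lua_load:
        # Scripts loaded with lua-load-per-thread keep the default of 10000.
        return 10000
    # Scripts loaded with lua-load: 10000 split across threads, never below 500.
    return max(500, 10000 // threads)


for threads in (1, 4, 20, 80):
    print(threads, default_forced_yield(threads, uses_lua_load=True))
```

On a machine with many threads the floor of 500 instructions takes over, so the per-thread budget stops shrinking once the thread count exceeds 20.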

Breaking Changes

Multiple CLI commands no longer supported

Previously, it was occasionally possible to successfully issue multiple commands by separating them with newlines, which had the potential to produce unexpected results for long-running commands that may only partially complete. A warning will now be emitted when a \n is detected in a command and the command will not be accepted. This change has also been backported to ensure that user scripts that utilize this behavior can be remedied.

Enabled keyword rejected for dynamic servers

When defining a dynamic server, use of the enabled keyword is now rejected with an error, whereas previously it was only silently ignored. Here's a sample input:

[Listing: blog20240626-60.sh]

This produces the following output:

[Listing: blog20240626-61.txt]

Invalid request targets rejected for non-compliant URIs

Parsing is now more strict during H1 processing for request target validation. This means that where previously, for compatibility, non-standard-compliant URIs were forwarded as-is for HTTP/1, now some invalid request targets are rejected with a 400 Bad Request error. The following rules now apply:

The asterisk-form is now only allowed for OPTIONS and OTHER methods. There must now be only one asterisk and nothing more.
The CONNECT method must have a valid authority-form. All other forms are rejected.
The authority-form is now only supported for the CONNECT method. Origin-form is only checked for the CONNECT method.
Absolute-form must have a scheme and a valid authority.
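The rules above can be approximated with a short Python check. This is a simplified sketch, not HAProxy's parser: the set of known methods, the host:port pattern (which ignores IPv6 literals), and the function name are assumptions, and the rule about origin-form checks for CONNECT is not modeled separately.

```python
import re
from urllib.parse import urlsplit

# Assumption: any method outside this set is treated as an "OTHER" method.
KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "CONNECT", "OPTIONS", "TRACE", "PATCH"}
# Simplified authority-form: host:port, without IPv6 literal support.
AUTHORITY_RE = re.compile(r"[A-Za-z0-9.-]+:\d{1,5}")


def is_valid_target(method: str, target: str) -> bool:
    """Apply a simplified version of the request-target rules listed above."""
    if method == "CONNECT":
        # CONNECT accepts only authority-form.
        return AUTHORITY_RE.fullmatch(target) is not None
    if target.startswith("*"):
        # Asterisk-form: exactly one asterisk, and only for OPTIONS or OTHER methods.
        is_other = method not in KNOWN_METHODS
        return target == "*" and (method == "OPTIONS" or is_other)
    if target.startswith("/"):
        return True  # origin-form
    try:
        parts = urlsplit(target)
    except ValueError:
        return False
    # Absolute-form needs both a scheme and an authority; a bare host:port
    # (authority-form) has no "//" and is rejected for non-CONNECT methods.
    return bool(parts.scheme and parts.netloc)


checks = [
    ("OPTIONS", "*"),
    ("GET", "*"),
    ("CONNECT", "example.com:443"),
    ("GET", "example.com:443"),
    ("GET", "http://example.com/cart"),
    ("GET", "http:///cart"),
]
for method, target in checks:
    print(method, target, is_valid_target(method, target))
```

In HAProxy 3.0, requests that fail checks of this kind receive a 400 Bad Request instead of being forwarded as-is.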

Tune.ssl.ocsp-update renamed to ocsp-update

The tune.ssl.ocsp-update global keyword is now named ocsp-update, as ocsp-update is unrelated to SSL tuning.

Development Improvements

This release brings with it some major improvements for developers and contributors, as well as aids in saving time diagnosing issues and speeding up recovery:

The internal API for applets has been simplified, with new applet code having its own buffers, and keyword handlers for the Runtime API now have their own buffers as well.
Updates to the makefile improve ease of use for packagers, including improved warnings plus easier passing of CFLAGS and LDFLAGS. Unused build options will produce a warning which will assist in debugging typos for build options with long names.
A new debugging feature has been added to the SSL and HTTP cache that allows assignment of a name to some memory areas so that it is more easily identified in the process map (using /proc/$pid/maps or using pmap on Linux versions 5.17 or greater). This makes it so that you can more easily determine where and why memory is being used. Future iterations will include more places where this debugging feature is implemented, further improving troubleshooting.
By default, HAProxy tries hard to prevent any thread and process creation after it starts. This is particularly important when running HAProxy’s own test suite, when executing Lua files of uncertain origin, and when experimenting with development versions, which may still contain bugs whose exploitability is uncertain. Generally speaking, it's a best practice to make sure that no unexpected background activity can be triggered by traffic. However, this may prevent external checks from working, and it may break some very specific Lua scripts which actively rely on the ability to fork. The global option insecure-fork-wanted disables this protection. As of version 3.0, you can also activate this option by using -dI (-d uppercase “i”) on the HAProxy command line. Note that it is a bad idea to disable the protection, as a vulnerability in a library or within HAProxy itself may be easier to exploit once it is disabled. In addition, forking from Lua, or anywhere else, is not reliable, as the forked process could embed a lock set by another thread and cause operations to never cease execution. As such, we recommend that you use this option with extreme caution, and that you move any workload requiring such a fork to a safer solution (such as using agents instead of external checks).
The DeviceAtlas module has been updated to support the new version of DeviceAtlas.
Support for FreeBSD 14 (and its new sched_setaffinity() system call) has been added.

Conclusion

HAProxy 3.0 was made possible through the work of contributors that pour immense effort into open-source projects like this one. This work includes participating in discussions, bug reporting, testing, documenting, providing help, writing code, reviewing code, and hosting packages.

While it's sadly impossible to include every contributor name here, all of you are invaluable members of the HAProxy community! Thank you.

The post Reviewing Every New Feature in HAProxy 3.0 appeared first on HAProxy Technologies.

Announcing HAProxy Kubernetes Ingress Controller 3.0

Zlatko Bratkovic and Hélène Durand — Tue, 25 Jun 2024 09:44:00 +0000

HAProxy Kubernetes Ingress Controller 3.0 is now available. For our enterprise customers, HAProxy Enterprise Kubernetes Ingress Controller 3.0 will arrive later this year and incorporate these same features. In this release, we've added TCP custom resource definitions (CRDs) to improve mapping, structuring, and validation for TCP services within HAProxy Kubernetes Ingress Controller. Say goodbye to messy service list management and "hello" to greater flexibility with HAProxy options for your K8s services.

In this blog post, we'll share a quick note on updated naming conventions before diving deeper into HAProxy Kubernetes Ingress Controller's headlining features.

Version compatibility with HAProxy

HAProxy Kubernetes Ingress Controller 3.0 is built with HAProxy version 3.0 and has now jumped from version 1.11 to version 3.0. Starting with this release, Kubernetes Ingress Controller's version number will match the version of HAProxy it uses. We hope this clarifies the link between HAProxy Kubernetes Ingress Controller and its baseline version of HAProxy, moving forward.

Custom Resource Definitions: TCP

Until now, mapping for TCP services was available through a custom ConfigMap using the --configmap-tcp-services flag. While this worked as expected, there were a few limitations we needed to address.

For example, ConfigMap alone doesn't have a standardized structure nor validation. Therefore, keeping a larger list of services tidy can be challenging. Additionally, only some HAProxy options (such as service, port, and SSL/TLS offloading) were available for those types of services.

The tcps.ingress.v1.haproxy.org definition, conversely, lets us define and use more HAProxy options than we could with ConfigMap.

Installing and getting to know TCP CRDs

If you're using Helm, the TCP services definition will be installed automatically. Otherwise, it's available as a raw YAML file via GitHub.

TCP custom resources (CRs) are namespaced and you can deploy several of them in a shared namespace. HAProxy will apply them all.

A TCP CR contains a list of TCP service definitions. Each service definition has:

A name
A frontend section containing two permitted components:
- Any setting from client-native frontend model
- A list of binds coupled with any settings from client-native bind models
A service definition that's a Kubernetes upstream Service/Port (the K8s Service and the deployed TCP CR must be in the same namespace).

Here's a simple example of a TCP service:

[Listing: blog20240627-01.yml]

How do we configure service and backend options? You can use the Backend Custom Resource (and reference it in the Ingress Controller ConfigMap, Ingress, or the Service) in conjunction with the TCP CR.

Mitigating TCP collisions

TCP services are tricky since they allow for unwanted naming and configuration duplications. This overlap can cause transmission delays and other performance degradations while impacting reliability.

Luckily, HAProxy can detect and manage two types of collisions:

Collisions on frontend names
Collisions on bind addresses and ports

If several TCP services across all namespaces encounter these collisions, HAProxy will only apply the one that was created first based on the older CreationTimestamp of the custom resource. This will generate a message in the log.
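The "oldest wins" rule can be sketched in Python as follows. The controller's actual algorithm is not described beyond the CreationTimestamp rule, so the data model, field names, and sample values here are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TcpService:
    namespace: str
    frontend: str            # frontend name
    binds: tuple[str, ...]   # "address:port" strings
    created: datetime        # stands in for the CR's CreationTimestamp


def apply_oldest_first(services):
    """Apply services oldest first; reject any that collide with one already applied."""
    applied, rejected = [], []
    used_names, used_binds = set(), set()
    for svc in sorted(services, key=lambda s: s.created):
        if svc.frontend in used_names or used_binds.intersection(svc.binds):
            rejected.append(svc)  # a real controller would log a message here
            continue
        applied.append(svc)
        used_names.add(svc.frontend)
        used_binds.update(svc.binds)
    return applied, rejected


# Hypothetical services: one frontend-name collision and one bind collision.
services = [
    TcpService("team-b", "tcp-http", ("0.0.0.0:32766",), datetime(2024, 6, 2, tzinfo=timezone.utc)),
    TcpService("team-a", "tcp-http", ("0.0.0.0:32767",), datetime(2024, 6, 1, tzinfo=timezone.utc)),
    TcpService("team-c", "tcp-dns", ("0.0.0.0:32767",), datetime(2024, 6, 3, tzinfo=timezone.utc)),
]

applied, rejected = apply_oldest_first(services)
print([svc.namespace for svc in applied])
print([svc.namespace for svc in rejected])
```

Sorting by creation time before checking for collisions makes the outcome deterministic regardless of the order in which the resources are read.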

SSL/TLS in a TCP custom resource

Here's a quick example of a TCP service with SSL/TLS enabled:

[Listing: blog20240627-02.yml]

Keep in mind that ssl_certificate can be one of the following:

The name of a Kubernetes Secret (in the same namespace as the TCP CR) containing the certificate and key
A folder or filename on the pod's local filesystem, which was mounted as a Secret Volume

For example, you can mount an SSL/TLS Secret in the Ingress Controller Pod on a volume and reference the volume mount path in ssl_certificate. Without changing the Pod (or deployment manifest), you can instead use a Secret name within the ssl_certificate configuration. As a result, the certificate and key will be written in the Pod's filesystem at the etc/haproxy/certs/tcp path.

Custom Resource Definitions (CRDs): transitioning from alpha versions

In HAProxy Kubernetes Ingress Controller 1.11, we deprecated the v1alpha1 and v1alpha2 CRD versions. Only v1alpha2 is supported within version 3.0. However, this will be the last Kubernetes Ingress Controller release where this specific version is available. If you're currently using v1alpha2, we strongly recommend upgrading to the v1 version.

Breaking changes

If you're using --configmap-tcp-services, this release changes the default backend configuration for a TCP Service defined via annotation in your ConfigMap. Previously, any backend options defined in the ConfigMap (such as maxconn or server-slots) didn't apply to TCP backends. These options now apply to TCP backends defined via annotation in ConfigMap.

Contributions

HAProxy Kubernetes Ingress Controller's development thrives on community feedback and feature input. We’d like to thank the code contributors who helped make this version possible!

Contributor	Area
Hélène Durand	FEATURE, BUG, TEST
Ivan Matmati	FEATURE, BUG
Dinko Korunić	FEATURE
Olivier Doucet	FEATURE
Fabiano Parente	BUG
Petr Studeny	BUG
jaraics	BUG
Ali Afsharzadeh	BUILD
Zlatko Bratković	BUILD, FEATURE, DOC, TEST

Conclusion

HAProxy Kubernetes Ingress Controller 3.0 represents our commitment to delivering a flexible and efficient platform for managing ingress traffic. By extending our prior CRD support to include TCP CRDs, our Kubernetes solutions can meet even more use cases with less complexity.

To learn more about HAProxy Kubernetes Ingress Controller, follow our blog and browse our Ingress Controller documentation. If you want to see how HAProxy Technologies also provides external load balancing and multi-cluster routing alongside our ingress controller, check out our Kubernetes solutions and our K8s webinar.


Announcing HAProxy 3.0

Nick Ramirez and Ashley Morris — Wed, 29 May 2024 00:00:00 +0000

Here we are in our twenty-third year, and open source HAProxy is going strong. HAProxy is the world’s fastest and most widely used software load balancer, with over one billion downloads on Docker Hub. It is the G2 category leader in API management, container networking, DDoS protection, web application firewall (WAF), and load balancing.

HAProxy maintains its edge over alternatives with best-in-class load balancing performance and reliability, the flexibility to support a wide variety of workloads, and a programmable and extensible architecture that fits your workflow.

Today, HAProxy 3.0 has arrived, and HAProxy Enterprise 3.0 will be released later this year! In this blog post, we'll cover the changes in a short and digestible format, leaving the longer-format configuration examples and deep dives for follow-up blog posts.

For a live introduction to the new release, register for our webinar HAProxy 3.0: Feature Roundup. Join our experts as we examine new features and updates and participate in the live Q&A.

How to get HAProxy 3.0

You can install HAProxy version 3.0 in any of the following ways:

Run it as a Docker container. View the Docker installation instructions.

Compile it from source. View the compilation instructions.

Major changes

First, let's cover the most important changes in HAProxy 3.0. These changes substantially modify how things were done in previous versions or introduce entirely new capabilities.

Loading TLS certificates with the new crt-store section: The new crt-store configuration section provides a flexible way to store and consume SSL certificates. Replacing crt-list, crt-store separates the storage of certificates from their use in a frontend. The crt-store section allows you to specify the location of each certificate component individually, for example, certificate files, key files, and OCSP response files. Aliases give certificates human-friendly names, which makes them easier to reference on bind lines. The ocsp-update argument is now configured in a crt-store instead of a crt-list.
Limiting glitchy HTTP/2 connections: Some HTTP/2 requests are valid from a protocol perspective but pose problems anyway. For example, sending a single header as a large number of CONTINUATION frames could cause a denial of service. HAProxy now counts these so-called glitches and allows you to set a limit on them. You can also track them in a stick table to identify buggy applications or misbehaving clients.
Assigning GUIDs to configuration objects: The new guid directive available in frontend, backend, and listen sections lets you assign a unique identifier to that section. The server directive also gained a guid argument. For now, the main use is for persisting stats after a reload, since only stats associated with objects having a GUID can be restored.
Persisting stats after a reload: Reloading HAProxy will no longer reset the HAProxy Stats page, as long as you call the new Runtime API command dump stats-file first to save the current state to a file and then load that file with the stats-file configuration directive. Ensure that you've set a GUID on each frontend, backend, listen and server object by using the new guid keywords.
Load balancing Syslog: The feature for load balancing Syslog messages, which was introduced in version 2.9, has progressed so that you can now set weights on server lines in your mode log backends. Meanwhile, the sticky algorithm, which had been limited to log backends, now applies to mode tcp and mode http backends as well.
Log as JSON and CBOR: You can now format log lines as JSON and CBOR. When configuring a custom log format, you will indicate which to use, and then in parentheses set the key for each field.
More data exposed as fetch methods: New fetch methods expose data previously available only within logs. They include fetches that return the number of open HTTP streams for a backend or frontend, the size of the backend queue, the allowed number of streams, and a value that indicates whether a connection got redispatched because a server was unreachable.
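
As a rough illustration of how several of the major changes above might fit together, the fragment below sketches a crt-store section with an alias, GUIDs combined with a stats file, an HTTP/2 glitch limit, and a JSON log format. Every name, path, address, and numeric value is a placeholder chosen for this example. The glitch-limit keyword shown (tune.h2.fe.glitches-threshold) is our assumption of the relevant global setting, so check it against the configuration manual for your version before relying on it.

```haproxy
global
    # File previously written from the output of the Runtime API
    # command "dump stats-file" (placeholder path).
    stats-file /var/lib/haproxy/stats.dump

    # Assumed keyword: close a frontend HTTP/2 connection once it has
    # accumulated 50 glitches. The value is arbitrary.
    tune.h2.fe.glitches-threshold 50

# Certificate storage, kept separate from the frontend that uses it.
crt-store web
    crt-base /etc/haproxy/ssl/certs/
    key-base /etc/haproxy/ssl/keys/
    # ocsp-update is set here rather than in a crt-list.
    load crt "site1.crt" key "site1.key" alias "site1" ocsp-update on

frontend www
    # A GUID is needed for this object's stats to be restored after a reload.
    guid frontend-www
    mode http
    # Reference the certificate by its crt-store name and alias.
    bind :443 ssl crt "@web/site1"
    log stdout format raw local0
    # JSON-encoded log line; the name in parentheses is the key for each field.
    log-format "%{+json}o %(client_ip)ci %(client_port)cp %(status)ST"
    default_backend app

backend app
    guid backend-app
    mode http
    server app1 192.0.2.10:8080 guid backend-app-app1
```

With a configuration along these lines, you would save the output of dump stats-file to the stats-file path before reloading, so that counters for the objects carrying a GUID can be restored.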

Noteworthy changes

Beyond the major changes, there are changes that simplify the configuration, improve performance, or extend existing functionality.

Improving Lua performance: Single-threaded Lua scripts using lua-load will see a performance improvement. This improvement is the result of a change to the loading mechanism, where the maximum number of instructions is now divided by the number of threads. This makes it so that waiting threads have a shorter wait time and share the time slot more evenly. Safeguards are in place to prevent thread contention for threads waiting for the global Lua lock.
Improving stick table performance: Stick tables have received a performance boost due to a change in the locking mechanism. Stick tables are now sharded over multiple tree heads, each having their own lock, and thus reducing lock contention. This means that on systems with many threads, stick table performance improves greatly. On a system with 80 threads, we measured performance gains of approximately 6x. As for systems with low thread counts, performance could be improved by as much as 2x when using peers.
Setting default TLS certificates: When using a single frontend to load balance multiple websites, you host different TLS certificates for each site, typically by placing all certificates in a directory and letting HAProxy choose the correct one based on TLS SNI. New in this version, you can use the default-crt argument to indicate which certificate to use when no other certificates match. You can also set different defaults to support RSA and ECC algorithms. In a CRT-List, you can designate a default certificate by adding an asterisk after it.
Controlling which HTTP errors to track: Until now, you could capture in a stick table the count and rate of client HTTP errors (4xx status codes) and server HTTP errors (5xx status codes), but you could not control specifically which status codes were included. This version adds global directives http-err-codes and http-fail-codes that let you set the status codes you care about, allowing you to ignore those that don't matter to you.
Prioritizing traffic on the frontend and backend: HAProxy can modify the header of an IP packet to include the Differentiated Services (DS) field. This field classifies the packet so that the network stack can prioritize it higher or lower in relation to other traffic on the network. New in this version of HAProxy, you can set this field on connections to backend servers in addition to frontend connections to clients. To set the value, use the set-fc-tos and set-bc-tos actions (referring to the old Type of Service (TOS) header field, which has been superseded by DS).
Setting a mark on IP packets on the frontend and backend: With HAProxy, you can set the fwmark on an IP packet, which classifies it so that, for example, it can use a specific routing table. HAProxy 3.0 now supports setting an fwmark on connections to backend servers as well as to clients connected on the frontend. Use the set-fc-mark and set-bc-mark actions.
Creating UUIDv7 identifiers: The uuid fetch method now takes an optional argument that sets the version of the UUID to either 4 or 7. Combine the fetch with the unique-id-format directive and the unique-id fetch method to get an ID that you can attach to log entries.
Configuring virtual ACL and Map files: ACL and Map files no longer need to exist on disk. By prefixing the name of the file with virt@ on an acl line in the HAProxy configuration, you allow HAProxy to start up and treat the ACL and Map files as virtual representations only. You can then use the Runtime API to add and delete rows in the virtual files. This is especially useful in containerized environments, where defining storage volumes and mapping them to the container's filesystem can be a burden. You can also prefix the filename with opt@, which marks the file as optional. In that case, HAProxy will check for the file on the filesystem, but if it doesn't find the file, it will assume the file is virtual.
Relaying to the client or server when a gRPC connection has been aborted: When the client aborts a stream, the fs.rst_code fetch method returns the RST_STREAM reason code, and the fs.aborted fetch method returns true. To detect server aborts, use the corresponding fetch methods: bs.rst_code for the reason code and bs.aborted for the status.
A change in how servers are mapped in consistent-hash load balancing: When load balancing using a hash-based algorithm, HAProxy must keep track of which server is which. Instead of using numeric IDs to compute hash keys for the servers in a backend, the hash-key directive now supports using the servers’ addresses and ports to compute the hash keys. This is useful in cases where multiple HAProxy processes are balancing traffic to the same set of servers, as each independent HAProxy process will calculate the same hash key and therefore agree on routing decisions, even if its list of servers is in a different order.
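
To make a few of these noteworthy changes more concrete, here is a minimal sketch that combines custom error-code tracking, a default certificate, UUIDv7 request IDs, and address-based hash keys. The status-code lists, file paths, header name, and server addresses are illustrative assumptions, not recommended values.

```haproxy
global
    # Count only these client-side status codes as HTTP errors
    # in stick tables (example selection).
    http-err-codes 401,403,429
    # Count 5xx responses as server failures, except 501 (example selection).
    http-fail-codes 500-599 -501

frontend www
    mode http
    # default-crt names the certificate to serve when no certificate
    # in the directory matches the SNI sent by the client.
    bind :443 ssl default-crt /etc/haproxy/certs/default.pem crt /etc/haproxy/certs/
    # Generate a version 7 UUID for each request and pass it upstream.
    unique-id-format %[uuid(7)]
    http-request set-header X-Request-ID %[unique-id]
    default_backend app

backend app
    mode http
    balance source
    hash-type consistent
    # Derive each server's hash key from its address and port instead of
    # its numeric ID, so that independent HAProxy instances agree on routing
    # even when their server lists are ordered differently.
    server app1 192.0.2.10:8080 hash-key addr-port
    server app2 192.0.2.11:8080 hash-key addr-port
```

The same unique-id fetch can be added to a custom log format so that the ID attached to the request also appears in the corresponding log entry.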

Breaking changes

Although this is a major version release, there are only a few breaking changes, as you'll see in the short list below.

Detecting accidental multiple commands sent to the Runtime API: Previously, it was occasionally possible to successfully issue multiple commands, which had the potential to produce unexpected results for long-running commands that may only partially complete. A warning will now be emitted when a \n is detected in a command, and the command will not be accepted. This change has also been backported to ensure that user scripts that utilize this behavior can be remedied.
Rejecting the enabled keyword for dynamic servers: When defining a dynamic server, use of the enabled keyword is now rejected with an error, whereas previously it was only silently ignored.
Stricter parsing of non-standard URIs: Request target validation is now stricter during HTTP/1 processing. Previously, for compatibility, URIs that did not comply with the standard were forwarded as-is for HTTP/1. Now, some invalid request targets are rejected with a 400 Bad Request error.
Renamed tune.ssl.ocsp-update: The tune.ssl.ocsp-update global keyword is now named tune.ocsp-update, as ocsp-update is unrelated to SSL tuning.
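
For the rename in particular, the adjustment to an existing configuration is small. The sketch below assumes the renamed keywords are the maxdelay and mindelay settings under the tune.ocsp-update prefix, and the values (in seconds) are arbitrary examples.

```haproxy
global
    # Names used before HAProxy 3.0:
    # tune.ssl.ocsp-update.maxdelay 7200
    # tune.ssl.ocsp-update.mindelay 600

    # Names used from HAProxy 3.0 onward:
    tune.ocsp-update.maxdelay 7200
    tune.ocsp-update.mindelay 600
```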

Conclusion

In the early days of the HAProxy project, it would have been difficult to foresee the multitude of ways people would use HAProxy, or the vast number of organizations that have adopted it at scale. Today, HAProxy is the market leader in software load balancing. That's thanks to the dedication of our open-source community members who write code, test features, document keywords, help newcomers, and evangelize to their organizations. Thank you to all!

HAProxy 3.0 maintains the strong momentum of our open-source load balancer into 2024 with improvements to simplicity, performance, reliability, observability, and security. This introductory blog post barely scratches the surface!

Subscribe to our blog and stay tuned for further deep dives on the latest updates from HAProxy 3.0. And in case you missed it, catch up with the huge new features we announced earlier this month in HAProxy Enterprise 2.9.

Ready to upgrade to HAProxy 3.0? Here’s how to get started.
