I agree with the article, FastCGI is better than HTTP for these things.
Though I'd like to make another protocol known: Web Application Socket (WAS). I designed it 16 years ago at my dayjob because I thought FastCGI still wasn't good enough.
Instead of packing bulk data inside frames on the main socket, WAS has a control socket plus two pipes (raw request+response body). Both the WAS application and the web server can use splice() to operate on a pipe, for example. No framing needed. Also, requests are cancellable and the three file descriptors can always be recovered.
Over the years, we used WAS for many of our internal applications, and for our web hosting environment, I even wrote a PHP SAPI for WAS. Quite a large number of web sites operate with WAS internally.
FastCGI and HTTP are at two different levels. HTTP is for data transfer between, say, a browser and a server. FastCGI is for handling that data between the server and an application.
Just now I glanced at the article and it seems the author writes in a confusing way to imply that HTTP and FastCGI are interchangeable and they are not.
fwiw, I used fcgi for a decade for all our web customers.
> FastCGI and HTTP are at two different levels. HTTP is for data transfer between, say, a browser and a server. FastCGI is for handling that data between the server and an application. Just now I glanced at the article and it seems the author writes in a confusing way to imply that HTTP and FastCGI are interchangeable and they are not.
That might be just you. The article is littered with the qualifier "for reverse proxies", including in the title and two section headers, and "as the protocol between reverse proxies and backends" in the second paragraph. I don't know how it could be any more clear on this point.
The max_k comment you've quoted includes "for these things"; context clues suggest by "these things" he also means to limit his comment to the reverse proxy <-> backend leg.
The comment was made in response to the article. This whole discussion is in the context of the article. You choosing to ignore that doesn't mean everyone else has to let you.
> FastCGI and HTTP are at two different levels. HTTP is for data transfer between, say, a browser and a server. FastCGI is for handling that data between the server and an application.
Not entirely correct. A reverse proxy can speak either HTTP or a different protocol, such as FastCGI, with the application server. The article is talking about that communication.
They are not interchangeable for the browser-to-server communication, but they are for the server-to-application piece.
The article points out that HTTP and FastCGI are both options for reverse proxies to communicate to the downstream server. I didn't find a reference to them being interchangeable outside of that context. If there is or was one please quote it.
> imply that HTTP and FastCGI are interchangeable and they are not.
But they are interchangeable!
FastCGI and HTTP/1.1 are indeed on the same level. Both are transport protocols for HTTP requests.
It would be technically possible to implement FastCGI as an alternate transport protocol in browsers and web servers, just like HTTP/2 (SPDY) and HTTP/3 (QUIC) are alternate transport protocols for HTTP requests.
(This is not what the article and my comment are about, as others already pointed out.)
This is quite an interesting article for its omissions.
I remember the great FastCGI vs. SCGI vs. HTTP wars: I was founding a Web2.0 startup right at the time these technologies were gaining adoption, and so was responsible for setting up the frontend stack. HTTP won because of simplicity: instead of needing to introduce another protocol into your stack, you can just use HTTP, which you already needed to handle at the gateway. Now all sorts of complex network topologies became trivial: you could introduce multiple levels of reverse proxies if you ran out of capacity; you could have servers that specialized in authentication or session management or SSL termination or DDoS filtering or all the other cross-cutting concerns without them needing to know their position in the request chain; and you could use the same application servers for development, with a direct HTTP connection, as you did in production, where they'd sit behind a reverse proxy that handled SSL and authentication and abuse detection.
It also helped that nginx was lots faster than most FastCGI/SCGI modules of the time, and more robust. I'd initially set up my startup's stack as HTTP -> Lighttpd -> FastCGI -> Django, but it was way slower than just using nginx.
The use of HTTP was basically the web equivalent of the End-to-End Principle [1] for TCP/IP. It's the idea that the network and its protocols should be agnostic to what's being transmitted, and all application logic should be in nodes of the network that filter and redirect packets accordingly. This has been a very powerful principle and shouldn't be discarded lightly.
The observation the article makes is that for security, it's often better to follow the Principle of Least Privilege [2] rather than blindly passing information along. Allowlist your communications to only what you expect, so that you aren't unwittingly contributing to a compromise elsewhere in the network.
And the article is highlighting - not explicitly, but it's there - the tension between these two principles. E2E gives you flexibility, but with flexibility comes the potential for someone to use that flexibility to cause harm. PoLP gives you security, but at the cost of inflexibility, where your system can only do what you designed it to do and cannot easily adapt to new requirements.
> The use of HTTP was basically the web equivalent of the End-to-End Principle [1] for TCP/IP.
I don't think the analogy works, not in the context of connection caching and multiplexing. An intermediate gateway multiplexing multiple HTTP requests over another HTTP channel, where that channel is the terminal leg directly to the listening service (i.e. requests aren't demultiplexed before hitting the application socket), fundamentally violates the logic of end-to-end in multiple ways. The analogy only works, if at all, if you preserve 1:1 connection symmetry.
All the reverse proxy exploits can be traced directly back to violating end-to-end.
If the analogy were true, then SMTP delivery across multiple MXs would be end-to-end as well. It's not, and you see many of the same issues as with reverse proxies, including messaging boundary desync'ing.
I guess you're trying to analogize HTTP requests as messages, but it falls apart almost immediately in the context of all the hairy details. The nature of TCP and HTTP semantics and the various concrete protocol details throws a wrench into things, with predictable consequences.
The end-to-end principle doesn't permit playing fast and loose with semantics. It demands very hard, rigid boundaries regarding state management and transport layering. That's the whole point. "Mostly" end-to-end is not end-to-end, not even a little bit.
The HTTP semantics are useful for anyone developing a web app, but the wire protocol of HTTP itself is awful. Multiplexing didn’t arrive until HTTP 2.0, for example. So using HTTP for communication between a reverse proxy and a backend is very wasteful. There are also security issues, such as different parsers disagreeing on where the boundaries of a request end.
Google for example has long wrapped HTTP into their own Stubby protocol between their frontline web servers and applications; it’s much faster and more featureful than using the HTTP wire protocol. It’s something that a typical company doesn’t need, but once the scale increases it becomes worthwhile to justify using a different wire protocol and developing all the tooling around that new wire protocol.
Won't argue with that, but it's a classic example of "Worse is better" [1]. It was simple and "good enough". Being ubiquitous is often more important than being efficient.
Most of the arguments for using HTTP reverse proxying over FastCGI or SCGI came down to ubiquity. It let you do things (like connect directly to your app servers with a web browser) that you couldn't do with FastCGI.
> Multiplexing didn’t arrive until HTTP 2.0 for example. So using HTTP for communication between a reverse proxy and a backend is very wasteful.
HTTP 2.0 multiplexing is TCP-in-TCP; it's asking for trouble. Just open more connections and let TCP be your multiplexer. Depending on your connection rate, you can't really do 64k connections per frontend IP to each service ip:port, but if your rate isn't too high, 20-30k is feasible. Most HTTP-based applications don't need or benefit from anywhere near that level of concurrency from frontend to backend. But if that's not enough, you can add more IPs to the frontend or backend, or more ports to the backend.
I'm pretty sympathetic to the argument for FastCGI or similar as the protocol for frontend to backend though; having client set headers clearly separate from frontend set headers is very nice, and having clear agreement on message boundaries is of obvious value. Unless you're just doing a straight tcp proxy, in which case ProxyProtocol is good enough to transfer the original IPs and then pass data as-is.
I don't think that is how it happened. Yes, there was a SCGI/FastCGI schism, but it was mostly the Python ecosystem that used SCGI while the rest of the world was on FastCGI. Unless you were PHP, in which case you were on mod_php because it was an unmovable juggernaut.
Apache had a FastCGI module early on, but it received little love and was not that widely used. For many people, FastCGI was synonymous with nginx and lighttpd because these webservers came with support out of the box (nginx later got loadable modules, just as Apache had).
When PHP finally got PHP-FPM, that gigantic ecosystem slowly started moving, sometime in the late 00s, and then FastCGI really took over. Almost. Because at the same time, the cloud era started and brought the "just use HTTP bro" mindset. Amazon has always used HTTP internally since the 90s and I would guess that probably carried over to AWS?
So nowadays PHP, still being a silent juggernaut, is mostly on FastCGI, while most others have moved to cloud-era standards and use HTTP. Go, for example, matured at this time and all the tutorials use straight HTTP proxying.
Yes, FastCGI is the much more robust standard, but you will encounter friction if you use it on your cloud native application. For regular servers and VMs it is still common.
There was a small window where everyone was trying to move off Apache/mod_perl etc., coming up with all sorts of ways to talk to the backend faster… but then nginx walked into the chat and killed the C10k problem along with having its easy but fancy upstreaming, and that was that.
Nginx made horizontal scaling a cinch, and because of that, rewriting your backend to handle FastCGI etc. instead of plain HTTP was more effort than it was worth.
It makes a lot of sense. Most large organizations are collections of independent teams, many of whom don't communicate with each other other than sending quarterly OKRs and status updates back to their VP. The E2E principle is what allows them to each do their thing, agnostic to what the other servers handling the request are doing, and then let higher levels of the organization reconfigure and provision the system based on the needs of the moment.
Large organizations have a well-known pattern for how to handle this tension between the E2E principle and the PoLP. It's a firewall. As per the E2E principle, this is a node in the system, usually placed near the outside, which is responsible for inspecting and sanitizing every request that enters the system. The input is untrusted external requests that may have arbitrary binary data. The output is the particular subset of HTTP that form valid requests for the server, sanitized to a minimal grammar and now trusted because you reject every packet that wasn't a well-formed request for your particular service. As an added bonus, now you can collect stats on who is sending these malformed requests, which lets you do things like DDoS protection or calling their ISP or contacting the FBI.
The article even admits this: the right solution to untrusted headers is to strip out everything you aren't explicitly expecting at the reverse proxy. If you didn't know True-Client-IP exists, don't pass it on. Allowlist and block everything by default, don't blocklist and allow everything by default.
Putting security-critical logic in proxies is a violation of the End-to-End Principle, not an example of it. That doesn't mean it's a bad thing; as ragall notes, the End-to-End Principle doesn't make sense here.
You're correct that if the proxy removes all unknown headers, you're safe (with HTTP/2). But that sounds extremely inconvenient - before your application can use a new header, you have to talk to the team who runs the proxy. And popular reverse proxy software doesn't do that by default so it remains a huge footgun for the unwary. All completely avoided with FastCGI.
Set proxy_pass_request_headers off, and then explicitly proxy_set_header each individual header you want to forward to the variable representing it in nginx config.
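A sketch of that pattern in nginx config (the upstream name and the particular allowlisted headers here are illustrative, not a complete set):

```nginx
location / {
    proxy_pass http://backend;

    # Drop all client request headers by default...
    proxy_pass_request_headers off;

    # ...then forward only an explicit allowlist.
    proxy_set_header Host         $host;
    proxy_set_header Accept       $http_accept;
    proxy_set_header Content-Type $http_content_type;
    proxy_set_header X-Real-IP    $remote_addr;
}
```

Anything the backend never sees (True-Client-IP, X-Original-URL, and friends) can't be used against it.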
Or just use CloudFlare Tunnel, which gives you a bunch of other DDoS and abuse protection and keeps your app server off the public Internet.
> Most large organizations are collections of independent teams, many of whom don't communicate with each other other than sending quarterly OKRs and status updates back to their VP.
You describe an organizational failure, where different teams are allowed to do whatever they like instead of having a proper platform team, which can enforce security and standards for the benefit of interoperability. It's not an argument in favour of transparent end-to-end behaviour in datacenters.
What I dislike about nginx is ... the documentation. I find it virtually useless because of that.
Sadly httpd went the way of "let's make the configuration difficult"; I abandoned it when they suddenly changed the configuration format. I could have adjusted, but I switched to lighttpd instead (and past that point I let Ruby autogenerate any configuration format, so technically I could return to httpd, but I don't want to). I think people who develop webservers need to think hard before forcing people to adjust to a new format. If there really must be a willy-nilly switch of the configuration format, perhaps enable e.g. YAML configuration in addition, so that we don't suddenly have to work through new config statements.
Nginx is extremely well represented in AI training material, so virtually every decent model - even locally hosted ones - can deliver you solid answers about its config settings.
Call me an old crusty Luddite if you will, after all you'd not be wrong, but…
I feel that if I can't work something out without asking a generative ML model, then I probably don't understand it well enough to properly assess the generated answer, and if I didn't understand the documentation well enough in the first place then “verify it against the documentation” is not a suitable answer, so I probably shouldn't be self-hosting that system on the open network.
It is quite irritating that the existence of generative models is apparently becoming an acceptable excuse for inadequate documentation. Rather than suggesting that I ask Copilot when the Azure documentation is lacking, perhaps MS should ask Copilot to generate some better documentation (and have their human domain experts review it for correctness) so we have good documentation to work from. It strikes me that them using a bunch of LLM crunching power up-front is likely to be more efficient than a great many of us spending smaller amounts of resources each (many of us asking the same questions) at the point of consumption.
> That said, using a vintage technology has some downsides. It was never updated to support WebSockets
With widespread browser support for WHATWG streams, it's pretty easy to implement your own WebSockets over long-lived HTTP requests. Basically you just send a byte stream and prepend each message with a header, which can just be a size in many cases.
Advantages over WebSockets:
* No special path in your server layer like you need for WebSocket.
* Backpressure
* You get to take advantage of HTTP/2/3 improvements for free
* Lower framing overhead
Unfortunately, AFAIK it's still not supported to keep streaming your request body while receiving the response, so you need a pair of requests for full bidirectional streaming.
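A minimal sketch of that framing in Python (the 4-byte big-endian length prefix is one arbitrary choice of header; any size prefix works):

```python
import io
import struct

def write_message(stream, payload: bytes) -> None:
    # Prefix each message with its size so the receiver can find
    # message boundaries in the raw byte stream.
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)

def read_message(stream) -> bytes:
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("stream ended mid-header")
    (size,) = struct.unpack(">I", header)
    return stream.read(size)

# Round-trip two messages through an in-memory stand-in for the
# long-lived HTTP response byte stream.
buf = io.BytesIO()
write_message(buf, b"hello")
write_message(buf, b"world!")
buf.seek(0)
first, second = read_message(buf), read_message(buf)
```

The framing overhead is a flat 4 bytes per message, versus WebSocket's variable-length frame headers plus client-side masking.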
The untrusted header problem could potentially be fixed by having the reverse proxy embed all the trusted information in a specific header, and then it just has to make sure that one header is stripped from the request. Unfortunately, there isn't (yet) a standard for that.
Or you could use something like haproxy's proxy protocol (although that may not support all the information you want, and doesn't work for multiplexing).
Edit: actually the "Forwarded" header kind of fills that niche. Although you may want extensions for things like the client certificate.
FastCGI has "parameters" and HTTP headers are special parameters starting with "HTTP_" (mimicking CGI's environment variables). All parameters not starting with "HTTP_" can be trusted because only the web server (= FastCGI client) can construct them.
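So a backend can sort parameters by that prefix alone; a rough sketch in Python (parameter names follow the CGI/RFC 3875 conventions the comment describes):

```python
def split_params(params: dict[str, str]):
    """Separate client-controlled HTTP headers (HTTP_* parameters)
    from server-generated metadata (everything else)."""
    client = {k: v for k, v in params.items() if k.startswith("HTTP_")}
    server = {k: v for k, v in params.items() if not k.startswith("HTTP_")}
    return client, server

client, server = split_params({
    "REMOTE_ADDR": "203.0.113.7",   # set by the web server: trustworthy
    "REQUEST_METHOD": "GET",        # set by the web server: trustworthy
    "HTTP_X_REAL_IP": "10.0.0.1",   # copied from a client header: not trustworthy
})
```

The split is unambiguous because the FastCGI client is the only party that can emit non-HTTP_ parameters; there is no header for an attacker to smuggle into that namespace.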
I’ve rediscovered plain old CGI as a great way for users to “vibe code” custom pages on our platform. [1]
The scenario is we have our first party task lists and data viewers, but often users want to highly customize it. Say build a Kanban view or a custom dashboard with data filters and charts.
The box has a coding agent which means the user can code anything vs us building traditional report builder tools.
Go’s stdlib has good support on both the web server side and the CGI program side. The coding agent makes a page-name/main.go that talks CGI and the server delegates requests to it.
It’s all “person scale” data and page views, so no real need to optimize with FastCGI even.
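The CGI contract itself is tiny, which is what makes it vibe-codeable: request metadata arrives in environment variables, the body on stdin, and the response is written to stdout. A sketch in Python (their setup uses Go, but the contract is identical; the greeting logic is made up for illustration):

```python
import os

def run_cgi(environ) -> str:
    """Render a CGI response (headers, blank line, body) as one string;
    a real script reads os.environ and prints this to stdout."""
    name = environ.get("QUERY_STRING", "") or "world"
    return ("Content-Type: text/html\r\n"
            "\r\n"
            f"<h1>hello {name}</h1>")

# The web server sets QUERY_STRING etc. and executes the script per request.
print(run_cgi(dict(os.environ)))
```

One process per request is exactly why it fits "person scale": no daemon to manage, and a crashed page only kills its own request.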
> I still don't love how CGI uses environment variables.
Neither do I. They really only make sense in the context of a request which was actually to a CGI script resident in a document root - they're an exceptionally awkward way of describing other HTTP requests, especially ones which aren't being served from a document root. And there's a lot of information lost in translation, like the order and original capitalization of HTTP headers. (Not that these things are supposed to matter, but still.)
The mystery is why uWSGI isn't more widely used. Perhaps the name does not help. It has little to do with WSGI, just as FastCGI is unrelated to CGI.

It is a tiny binary protocol, with frames just as FastCGI. The reference server works with several languages; I've used it over the years mostly with Python but also Ruby and Perl. It is a small C executable with all the practical features one needs for web hosting: draining backends, autoscaling, logging, chrooted backends, everything.
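Tiny is not an exaggeration: the uwsgi wire format is a 4-byte header (modifier1, a 16-bit little-endian datasize, modifier2) followed by length-prefixed key/value strings. A sketch of a packet encoder in Python, going by the published protocol description:

```python
import struct

def pack_uwsgi(env: dict[str, str]) -> bytes:
    # Body: key/value pairs, each string prefixed with a 16-bit LE length.
    body = b""
    for k, v in env.items():
        kb, vb = k.encode(), v.encode()
        body += struct.pack("<H", len(kb)) + kb
        body += struct.pack("<H", len(vb)) + vb
    # Header: modifier1=0 (a WSGI var block), datasize, modifier2=0.
    return struct.pack("<BHB", 0, len(body), 0) + body

pkt = pack_uwsgi({"REQUEST_METHOD": "GET", "PATH_INFO": "/"})
```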
Very few FastCGI servers are this mature. Unlike FastCGI, it has been extended to support websockets and async.
I have used it in production at several places for many years and have nothing but praise for it. It feels like this weird unknown secret for web operations. Unfortunately, it sees less use now in the cloud era, and development seems to have all but stopped. It still works and is still reliable, but the writing is probably on the wall. However, nothing comes close in terms of speed, simplicity, and features.
I was reading the article and thinking, I wonder what that proxy I'm using in Apache is using, it's really fast and I've had a lot of luck with it. OMG, so I've been using FastCGI all this time and had no idea, well that's awesome. :)
Does anyone remember mongrel2? A neat web server that used zeromq for backend communication (based on scgi). I always felt it had huge potential. It was also kinda famous in the early days of HN, back in the days when RoR was in vogue etc.
FCGI is also an orchestration system. It launches more server tasks when the load goes up, shuts them down when the load decreases, and launches new copies of tasks if they crash. It's like single-system Kubernetes.
> It launches more server tasks when the load goes up, shuts them down when the load decreases
In my experience, this isn't a good feature. It sounds nice, but it can often mean everything runs fine while your load is low, but when your load gets high, you spawn more workers and run out of memory. It's much better to have a static number of workers in my experience.
It's useful if you have multiple FCGI programs that handle different kinds of requests. Depending on what's being requested, programs start up and shut down.
Most of the stuff I've done for reverse proxies has been pretty straightforward and just using the stuff built into Nginx, but I have to admit that it wouldn't have even occurred to me to use FastCGI if I needed something more elaborate.
I used FastCGI a bit about ten years ago to "convert" some C++ code I wrote to work on the web, but admittedly I haven't used it much since then.
Also, embedded servers are now much much much more popular. Stuff an HTTP server directly into your application and do whatever you gotta do without gateways.
That is the way! Unfortunately, sometimes you have to do path-based routing to different backends, and now you're back to needing a proxy between your clients and your applications.
This is the way only if you're operating in a trusted environment (eg. homelab, intranet) or you're sticking CloudFlare or some other "reverse proxy as a service" in front of it. If you expose an embedded HTTP app server directly to the Internet you're almost guaranteed to get pwned, as the Internet has now become an extremely hostile place.
These are often not “battle-tested” enough and come with a warning to never expose them to the public internet. So then you put a WAF in front of it, and you are back to an HTTP reverse proxy setup.
This seems like really bad advice, or am I missing something?
Using fastcgi requires you write your app to serve fastcgi.
The upside of serving http/1.1 instead of fastcgi is that devs can instantly use their browser to test things instead of having to set up a reverse proxy on their machine.
The bad parts of http/1.1 are fixed equally well by both http/2.0 and fastcgi. So just use http/2.0 and you get the proper framing as well as browser support.
Please see the section about untrusted headers - this is not fixed by HTTP/2.
You're right that being able to point your browser right at the app is very convenient. With Go, you can have a command line flag that switches between http.Serve (for development) and fcgi.Serve (for production).
In my experience having different serving paths for dev vs production is a recipe for annoying issues. I try to make dev as similar to prod as possible.
I’m not sure. I don’t dismiss fcgi outright here; I find the arguments for it compelling (not a huge fan of http for many reasons), but it has to be really worth it to break the consistency of using http everywhere.
If you want your dev environment to be as similar to prod as possible, and you use a proxy in prod, then you should use a proxy in dev also. I was presenting a solution to someone who doesn't want to do that.
I think perhaps I was unclear. I don’t mean the entire dev environment should mirror prod (although it’s great if you can do this for end to end testing). I just mean it’s desirable if the process you’re working on operates the same way in dev as in prod.
I've built a lot of API backends with Perl and FCGI::ProcManager, letting nginx (and Apache HTTPd in the past) front everything. For me it has been a pleasantly simple, incredibly robust and high-performing setup with no mess to speak of.
> Only if True-Client-IP doesn't exist does it use X-Real-IP. So even if your proxy does the right thing with X-Real-IP, you can still be pwned by an attacker sending a True-Client-IP header.
Can we just take a moment to appreciate the absurdity of HTTP headers? We have X-Forwarded-For, X-Real-IP, and each CDN has its own custom-flavored one. Some of them are a comma-separated list, and usually end up having an IP of your own LB uselessly added in there (I know why; it's just not helpful). All of them might be inserted by a malicious user-agent. I guess nobody could agree on how all the various trusted servers in the pipeline should convey the important bit.
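The only workable way to consume those lists is right-to-left, skipping the hops you added yourself; a sketch in Python (the trusted-proxy set is illustrative):

```python
def client_ip(xff: str, trusted_proxies: set[str]) -> str:
    """Walk X-Forwarded-For right to left and return the first hop we
    didn't add ourselves; everything left of it is attacker-controlled."""
    hops = [h.strip() for h in xff.split(",")]
    for hop in reversed(hops):
        if hop not in trusted_proxies:
            return hop
    return hops[0]

# The attacker sent a fake entry; our LB appended the real peer address.
ip = client_ip("6.6.6.6, 203.0.113.9, 10.0.0.2", {"10.0.0.2"})
```

Reading left-to-right instead (as naive implementations do) hands the attacker control of the "client IP".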
I guess it fits in quite well with the absurdity of the User-Agent header, which has come so far in absurdity that Apple decided to fully kill it by just sending utterly fake nonsense (false OS version, etc) in the name of "pRiVaCy."
I think there is a lot of merit to this argument, however, FastCGI defers to CGI/1.1 for `PATH_INFO`, etc., which is lossy as it must be URL-decoded and therefore cannot represent encoded slashes, `%2F`. (Some implementations also collapse `//` to `/` in path, but this is an issue in various HTTP implementations too.)
It is less expressive than HTTP in ways that may or may not be important to your application; I prefer accurate URL handling.
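The loss is easy to demonstrate: once decoded, a path containing %2F is indistinguishable from one with a literal slash:

```python
from urllib.parse import unquote

# Two different request targets...
a = "/files/report%2F2024"   # one segment containing an encoded slash
b = "/files/report/2024"     # two segments

# ...become identical after the URL-decoding that CGI's PATH_INFO mandates.
decoded_a, decoded_b = unquote(a), unquote(b)
```

An application routing on PATH_INFO therefore cannot tell these requests apart, while one routing on the raw HTTP request target can.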
Indeed. I'm sure that someone will butt in with "it's just a bad implementation!" but the whole bit about allowlisting communications will cause flashbacks in those of us who had all our PUT requests just quit working on an IIS server.
I‘ve had good experiences with FastCGI back when Perl was popular.
These days, WebTransport is the new sexy thing.
Probably not a real FastCGI replacement.
As I understand it FastCGI doesn't handle websockets, which is a shame. It should be able to handle SSE though since that's effectively just a regular slow-loading/streaming HTTP response.
There is no way to do what you ask for without having a local CA. And once you have it, any existing caching proxy will work. Now you have moved all security to the proxy, but for some applications it is worth it.
I'd love for CGI to be updated, kind of merging what works and not really caring about what does not work. Getting a .cgi file to work on Linux is really easy. Naturally you get more leverage with e. g. rails, but there is also a lot more complexity and I really hate intrinsic complexity.
CGI and FastCGI are two different things in two different domains. Well, the domains are not that different, but enough that CGI solves a real problem and makes sense and FastCGI does not. CGI is the interface between an HTTP transaction and a process. It answers the question "How do we turn an HTTP request into executing a process?". FastCGI answers the question "How do we turn an HTTP request into a FastCGI request?", a convolution that leaves you asking: Why are we jumping through this hoop? Is FastCGI actually bringing anything to the table? Is it actually more difficult to have an HTTP server instead of a FastCGI server if they are so trivially connected?
I am halfway convinced the only reason FastCGI exists is that we had gotten into a mindset that executable code in an HTTP context had to run via the Common Gateway Interface, and when we wanted to change to a persistent process model, it had to have the CGI name as well. Well, FastCGI to the rescue: it does exactly what HTTP does but is not HTTP, and most importantly has CGI in the name.
As to the article's complaint, "An HTTP relay server had a bug, therefore HTTP is intrinsically bad"... well, it failed to convince me. I am not exactly in that domain (backend web development) so my view is not worth much. But I feel that your internal HTTP (application) servers should be built as if they were going directly on the open web. Then you put some relay servers in front in order to block, balance and route requests. But avoid putting too many smarts in the relay servers. A smart network is almost always a bad idea; try to stick with a dumb network and smart edges.
Your question is really why we have HTTP servers and HTTP middleware at all. Running an application directly on an endpoint makes sense for embedded applications, but for everything else you will end up with a separate HTTP-terminating layer for a number of reasons.
It makes maintenance so much easier, when you can scale request backends independently of request frontends. It makes sense that the application doesn't have access to TLS keys and can't bind end user facing ports directly. The day you want a separate access log from the application log, and separate instrumentation of the frontends because of some hard to track down bug you will be thankful it is there.
So, you want to deploy some sort of proxy. Will you run FastCGI, SCGI, WSGI, uWSGI, plain HTTP, or a number of proprietary load balancer protocols? If you choose HTTP you have to take care to wash your headers and be very careful with pipelining so front and back end expect the same behaviour (which is true for several of the proprietary protocols too). Otherwise there can be catastrophic security implications.
It is probably true that FastCGI was designed as a persistent CGI. But as soon as you have persistence, the http server needs to forward transactions to the persistent backend, and what you have built is a proxy. There is no way around it.
> Is FastCGI actually bringing anything to the table?
Yes; it removes the need for the application server to perform parsing of HTTP, which is notoriously difficult to consistently do safely. The application server can use the safer and simpler FastCGI protocol rather than try to support the full HTTP spec.
> Is it actually more difficult to have an HTTP server instead of a FastCGI server if they are so trivially connected?
IME, yes. HTTP (the spec) is full of footguns. FastCGI has fewer footguns. HTTP requires a long dependency chain in your application. FastCGI requires maybe a single library.
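For a sense of scale: a FastCGI record is just an 8-byte fixed header followed by content and padding (per the FastCGI 1.0 spec), so the wire-level parsing fits in a few lines. A sketch in Python:

```python
import struct

def parse_record(data: bytes):
    """Parse one FastCGI record; returns (type, request_id, content, rest)."""
    version, rtype, req_id, clen, plen, _ = struct.unpack(">BBHHBB", data[:8])
    assert version == 1  # FCGI_VERSION_1 is the only version defined
    content = data[8:8 + clen]
    rest = data[8 + clen + plen:]
    return rtype, req_id, content, rest

# An FCGI_STDIN record (type 5) for request 1 carrying b"hi", no padding.
record = struct.pack(">BBHHBB", 1, 5, 1, 2, 0, 0) + b"hi"
rtype, req_id, content, rest = parse_record(record)
```

Compare that to the parsing needed for chunked transfer encoding, folded headers, and the rest of HTTP/1.1.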
> But I feel that your internal HTTP(application) servers should be built as if they were going directly on the open web.
I feel that too, but which developer do you know writes their own HTTP server inside their application? They all use the most popular server via a library or framework, almost all of which warn not to open that to the public internet.
What do you propose they do? They use a framework/library and get a warning not to expose it to the public internet, they don't use a framework/library and odds are good that they coded some vulnerability into it.
Embedding an HTTP server (or an FCGI one, for that matter) is trivial in most languages and also makes local development simpler. CGI was a horrible idea back then and it continues to be.
87 comments:
I agree with the article, FastCGI is better than HTTP for these things.
Though I'd like to make another protocol known: Web Application Socket (WAS). I designed it 16 years ago at my dayjob because I thought FastCGI still wasn't good enough.
Instead of packing bulk data inside frames on the main socket, WAS has a control socket plus two pipes (raw request+response body). Both the WAS application and the web server can use splice() to operate on a pipe, for example. No framing needed. Also, requests are cancellable and the three file descriptors can always be recovered.
Over the years, we used WAS for many of our internal applications, and for our web hosting environment, I even wrote a PHP SAPI for WAS. Quite a large number of web sites operate with WAS internally.
It's all open source:
- library: https://github.com/CM4all/libwas - documentation: https://libwas.readthedocs.io/en/latest/ - non-blocking library: https://github.com/CM4all/libcommon/tree/master/src/was/asyn... - our web server: https://github.com/CM4all/beng-proxy - WebDAV: https://github.com/CM4all/davos - PHP fork with WAS SAPI: https://github.com/CM4all/php-src
>>FastCGI is better than HTTP for these things.
FastCGI and HTTP are at two different levels. HTTP is for data transfer from, say, a browser and a server. FastCGI is for handling that data between the server and an application.
Just now I glanced at the article and it seems the author writes in a confusing way to imply that HTTP and FastCGI are interchangeable and they are not.
fwiw, I used fcgi for a decade for all our web customers.
> FastCGI is better than HTTP for these things.
> FastCGI and HTTP are at two different levels. HTTP is for data transfer from, say, a browser and a server. FastCGI is for handling that data between the server and an application. Just now I glanced at the article and it seems the author writes in a confusing way to imply that HTTP and FastCGI are interchangeable and they are not.
That might be just you. The article is littered with the qualifier "for reverse proxies", including in the title and two section headers, and "as the protocol between reverse proxies and backends" in the second paragraph. I don't know how it could be any more clear on this point.
The max_k comment you've quoted includes "for these things"; context clues suggest by "these things" he also means to limit his comment to the reverse proxy <-> backend leg.
I didn't quote anything from the article. I was responding to the comment, not the article.
The comment was made in response to the article. This whole discussion is in the context of the article. You choosing to ignore that doesn't mean everyone else has to let you.
> FastCGI and HTTP are at two different levels. HTTP is for data transfer from, say, a browser and a server. FastCGI is for handling that data between the server and an application.
Not entirely correct. A reverse proxy can either speak HTTP, or a different protocol such as FastCGI with the application server. The article is talking about that communication.
They are not interchangeable for the browser-to-server communication, but they are for the server-to-application piece.
Your last sentence is exactly what I said.
You didn't though. You may have intended to?
The article points out that HTTP and FastCGI are both options for reverse proxies to communicate to the downstream server. I didn't find a reference to them being interchangeable outside of that context. If there is or was one please quote it.
I was responding to the comment, not the article.
The article is really exclusively about the reverse proxy server to server use case, not client to server. The title even says it.
I responded to the comment, not the article.
> imply that HTTP and FastCGI are interchangeable and they are not.
But they are interchangeable!
FastCGI and HTTP/1.1 are indeed on the same level. Both are transport protocols for HTTP requests.
It would be technically possible to implement FastCGI as an alternate transport protocol in browsers and web servers, just like HTTP/2 (SPDY) and HTTP/3 (QUIC) are alternate transport protocols for HTTP requests.
(This is not what the article and my comment are about, as others already pointed out.)
I feel like the author of an alternative protocol probably knows these things.
I think the author mentions HTTP because many people use it where they could be using FastCGI and just don’t.
Please note that it's called FastCGI, not FastHTTP.
And yet I’ve seen in production the (ab)use of HTTP where fcgi would’ve been a much better fit.
This is quite an interesting article for its omissions.
I remember the great FastCGI vs. SCGI vs. HTTP wars: I was founding a Web2.0 startup right at the time these technologies were gaining adoption, and so was responsible for setting up the frontend stack. HTTP won because of simplicity: instead of needing to introduce another protocol into your stack, you can just use HTTP, which you already needed to handle at the gateway. Now all sorts of complex network topologies became trivial: you could introduce multiple levels of reverse proxies if you ran out of capacity; you could have servers that specialized in authentication or session management or SSL termination or DDoS filtering or all the other cross-cutting concerns without them needing to know their position in the request chain; and you could use the same application servers for development, with a direct HTTP connection, as you did in production, where they'd sit behind a reverse proxy that handled SSL and authentication and abuse detection.
It also helped that nginx was lots faster than most FastCGI/SCGI modules of the time, and more robust. I'd initially set up my startup's stack as HTTP -> Lighttpd -> FastCGI -> Django, but it was way slower than just using nginx.
The use of HTTP was basically the web equivalent of the End-to-End Principle [1] for TCP/IP. It's the idea that the network and its protocols should be agnostic to what's being transmitted, and all application logic should be in nodes of the network that filter and redirect packets accordingly. This has been a very powerful principle and shouldn't be discarded lightly.
The observation the article makes is that for security, it's often better to follow the Principle of Least Privilege [2] rather than blindly passing information along. Allowlist your communications to only what you expect, so that you aren't unwittingly contributing to a compromise elsewhere in the network.
And the article is highlighting - not explicitly, but it's there - the tension between these two principles. E2E gives you flexibility, but with flexibility comes the potential for someone to use that flexibility to cause harm. PoLP gives you security, but at the cost of inflexibility, where your system can only do what you designed it to do and cannot easily adapt to new requirements.
[1] https://en.wikipedia.org/wiki/End-to-end_principle
[2] https://en.wikipedia.org/wiki/Principle_of_least_privilege
> The use of HTTP was basically the web equivalent of the End-to-End Principle [1] for TCP/IP.
I don't think the analogy works, not in the context of connection caching and multiplexing. An intermediate gateway multiplexing multiple HTTP requests over another HTTP channel, where that channel is the terminal leg directly to the listening service (i.e. requests aren't demultiplexed before hitting the application socket), fundamentally violates the logic of end-to-end in multiple ways. The analogy only works, if at all, if you preserve 1:1 connection symmetry.
All the reverse proxy exploits can be traced directly back to violating end-to-end.
If the analogy were true, then SMTP delivery across multiple MXs would be end-to-end as well. It's not, and you see many of the same issues as with reverse proxies, including messaging boundary desync'ing.
I guess you're trying to analogize HTTP requests as messages, but it falls apart almost immediately in the context of all the hairy details. The nature of TCP and HTTP semantics and the various concrete protocol details throws a wrench into things, with predictable consequences.
The end-to-end principle doesn't permit playing fast and loose with semantics. It demands very hard, rigid boundaries regarding state management and transport layering. That's the whole point. "Mostly" end-to-end is not end-to-end, not even a little bit.
The HTTP semantics are useful for anyone developing a web app, but the wire protocol of HTTP itself is awful. Multiplexing didn’t arrive until HTTP 2.0, for example. So using HTTP for communication between a reverse proxy and a backend is very wasteful. There are security issues too, such as different parsers disagreeing on where the boundaries of a request end.
Google for example has long wrapped HTTP into their own Stubby protocol between their frontline web servers and applications; it’s much faster and more featureful than using the HTTP wire protocol. It’s something that a typical company doesn’t need, but once the scale increases it becomes worthwhile to justify using a different wire protocol and developing all the tooling around that new wire protocol.
Won't argue with that, but it's a classic example of "Worse is better" [1]. It was simple and "good enough". Being ubiquitous is often more important than being efficient.
Most of the arguments for using HTTP reverse proxying over FastCGI or SCGI came down to ubiquity. It let you do things (like connect directly to your app servers with a web browser) that you couldn't do with FastCGI.
[1] https://dreamsongs.com/RiseOfWorseIsBetter.html
> Multiplexing didn’t arrive until HTTP 2.0 for example. So using HTTP for communication between a reverse proxy and a backend is very wasteful.
HTTP 2.0 multiplexing is TCP in TCP; it's asking for trouble. Just open more connections and let TCP be your multiplexer. Depending on your connection rate, you can't really do 64k connections per frontend IP to each service ip:port, but if your rate isn't too high, 20-30k is feasible. Most HTTP-based applications don't need or benefit from anywhere near that level of concurrency from frontend to backend. But if it's not enough, you can add more IPs to the frontend or backend, or more ports to the backend.
I'm pretty sympathetic to the argument for FastCGI or similar as the protocol for frontend to backend though; having client set headers clearly separate from frontend set headers is very nice, and having clear agreement on message boundaries is of obvious value. Unless you're just doing a straight tcp proxy, in which case ProxyProtocol is good enough to transfer the original IPs and then pass data as-is.
Don’t forget http pipelining!
I don't think that is how it happened. Yes, there was a SCGI/FastCGI schism, but it was mostly the Python ecosystem that used SCGI and the rest of the world was on FastCGI. Unless you were PHP, in which case you were on mod_php because it was an unmovable juggernaut.
Apache had a FastCGI module early on, but it received little love and was not that widely used. For many people, FastCGI was synonymous with nginx and lighttpd because these webservers came with support out of the box (nginx later got modules just as Apache).
When PHP finally got PHP-FPM, that gigantic ecosystem slowly started moving, sometime in the late 00s, and then FastCGI really took over. Almost. Because at the same time, the cloud era started and brought the "just use HTTP bro" mindset. Amazon has always used HTTP internally since the 90s and I would guess that probably carried over to AWS?
So nowadays, PHP still being a silent juggernaut, is now mostly on FastCGI, while most other have moved to cloud era standards and use HTTP. Go, for example, matured at this time and all the tutorials use straight HTTP proxying.
Yes, FastCGI is the much more robust standard, but you will encounter friction if you use it on your cloud native application. For regular servers and VMs it is still common.
To summarise, nginx!
There was a small window where everyone was trying to move off Apache/mod_perl etc., coming up with all sorts of ways to talk to the backend faster… but then nginx walked into the chat and killed the C10K problem along with having its easy but fancy upstreaming, and that was that.
Nginx made horizontal scaling a cinch, and because of that, rewriting your backend to handle FastCGI etc. instead of HTTP was more effort than it was worth.
The end-to-end principle within a datacenter makes little sense and, as shown in the article, ends up enabling insecure behaviour.
It makes a lot of sense. Most large organizations are collections of independent teams, many of whom don't communicate with each other other than sending quarterly OKRs and status updates back to their VP. The E2E principle is what allows them to each do their thing, agnostic to what the other servers handling the request are doing, and then let higher levels of the organization reconfigure and provision the system based on the needs of the moment.
Large organizations have a well-known pattern for how to handle this tension between the E2E principle and the PoLP. It's a firewall. As per the E2E principle, this is a node in the system, usually placed near the outside, which is responsible for inspecting and sanitizing every request that enters the system. The input is untrusted external requests that may have arbitrary binary data. The output is the particular subset of HTTP that form valid requests for the server, sanitized to a minimal grammar and now trusted because you reject every packet that wasn't a well-formed request for your particular service. As an added bonus, now you can collect stats on who is sending these malformed requests, which lets you do things like DDoS protection or calling their ISP or contacting the FBI.
The article even admits this: the right solution to untrusted headers is to strip out everything you aren't explicitly expecting at the reverse proxy. If you didn't know True-Client-IP exists, don't pass it on. Allowlist and block everything by default, don't blocklist and allow everything by default.
Putting security-critical logic in proxies is a violation of the End-to-End Principle, not an example of it. That doesn't mean it's a bad thing; as ragall notes, the End-to-End Principle doesn't make sense here.
You're correct that if the proxy removes all unknown headers, you're safe (with HTTP/2). But that sounds extremely inconvenient - before your application can use a new header, you have to talk to the team who runs the proxy. And popular reverse proxy software doesn't do that by default so it remains a huge footgun for the unwary. All completely avoided with FastCGI.
Can you recommend a reverse proxy that supports white-listing of headers? nginx doesn't seem to.
Had to Google since it's been almost 20 years since I used nginx directly:
https://serverfault.com/questions/1033131/filter-to-only-pas...
Set proxy_pass_request_headers off, and then explicitly proxy_set_header each individual header you want to forward to the variable representing it in nginx config.
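A sketch of that configuration (the upstream name and the exact header set are illustrative; note that content-related headers like Content-Type are also dropped by `proxy_pass_request_headers off`, so they need re-adding for request bodies to work):

```nginx
server {
    listen 80;

    location / {
        # Drop all client-supplied headers by default...
        proxy_pass_request_headers off;

        # ...then forward only what the backend is known to need.
        proxy_set_header Host            $host;
        proxy_set_header X-Real-IP       $remote_addr;
        # Needed so POST/PUT bodies keep working:
        proxy_set_header Content-Type    $content_type;
        proxy_set_header Content-Length  $content_length;

        proxy_pass http://app_backend;
    }
}
```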
Or just use CloudFlare Tunnel, which gives you a bunch of other DDoS and abuse protection and keeps your app server off the public Internet.
Thank you, I somehow missed that.
> Most large organizations are collections of independent teams, many of whom don't communicate with each other other than sending quarterly OKRs and status updates back to their VP.
You describe an organizational failure, where different teams are allowed to do whatever they like instead of having a proper platform team, which can enforce security and standards for the benefit of interoperability. It's not an argument in favour of transparent end-to-end behaviour in datacenters.
What I dislike about nginx is ... the documentation. I find it virtually useless because of that.
Sadly httpd went the way of "let's make the configuration difficult"; I abandoned it when they suddenly changed the configuration format. I could have adjusted, but I switched to lighttpd instead (and past that point I let Ruby autogenerate any configuration format, so technically I could return to httpd, but I don't want to). I think people who develop webservers need to think twice before forcing users to adjust to a new format. If they decide to willy-nilly switch the configuration format anyway, perhaps they could enable e.g. YAML configuration in ADDITION, so that we don't suddenly have to rewrite all our if-clause config statements.
I've been copying/modifying the same nginx config file for like 15 years
Little tweak here, little tweak there...
Nginx is extremely well represented in AI training material, so virtually every decent model - even locally hosted ones - can deliver you solid answers about its config settings.
Call me an old crusty Luddite if you will, after all you'd not be wrong, but…
I feel that if I can't work something out without asking a generative ML model, then I probably don't understand it well enough to properly assess the generated answer, and if I didn't understand the documentation well enough in the first place then “verify it against the documentation” is not a suitable answer, so I probably shouldn't be self-hosting that system on the open network.
It is quite irritating that the existence of generative models is apparently becoming an acceptable excuse for inadequate documentation. Rather than suggesting that I ask Copilot when the Azure documentation is lacking, perhaps MS should ask Copilot to generate some better documentation (and have their human domain experts review it for correctness) so we have good documentation to work from. It strikes me that them using a bunch of LLM crunching power up-front is likely to be more efficient than a great many of us spending smaller amounts of resources each (many of us asking the same questions) at the point of consumption.
> That said, using a vintage technology has some downsides. It was never updated to support WebSockets
With widespread browser support for WHATWG streams, it's pretty easy to implement your own WebSockets over long-lived HTTP requests. Basically you just send a byte stream and prepend each message with a header, which can just be a size in many cases.
Advantages over WebSockets:
* No special path in your server layer like you need for WebSocket.
* Backpressure
* You get to take advantage of HTTP/2/3 improvements for free
* Lower framing overhead
Unfortunately, AFAIK it's still not supported to keep streaming your request body while receiving the response, so you need a pair of requests for full bidirectional streaming.
Please be aware that there has been a web standard for this for quite some time. See server-sent events and the EventSource interface:
https://developer.mozilla.org/en-US/docs/Web/API/Server-sent... https://developer.mozilla.org/en-US/docs/Web/API/EventSource
That can be used with https://mercure.rocks :)
The untrusted header problem could potentially be fixed by having the reverse proxy embed all the trusted information in a specific header, and then it just has to make sure that one header is stripped from the request. Unfortunately, there isn't (yet) a standard for that.
Or you could use something like haproxy's proxy protocol (although that may not support all the information you want, and doesn't work for multiplexing).
Edit: actually the "Forwarded" header kind of fills that niche. Although you may want extensions for things like the client certificate.
FastCGI has "parameters" and HTTP headers are special parameters starting with "HTTP_" (mimicking CGI's environment variables). All parameters not starting with "HTTP_" can be trusted because only the web server (= FastCGI client) can construct them.
Unfortunately, it appeared too late, and the relevant support is now far less complete than that for `X-Forwarded-*`.
I’ve rediscovered plain old CGI as a great way for users to “vibe code” custom pages on our platform. [1]
The scenario is we have our first party task lists and data viewers, but often users want to highly customize it. Say build a Kanban view or a custom dashboard with data filters and charts.
The box has a coding agent which means the user can code anything vs us building traditional report builder tools.
Go’s stdlib has good support on both the server side and user space. The coding agent makes a page-name/main.go that talks CGI and the server delegates requests to it.
It’s all “person scale” data and page views so no real need to optimize with fast CGI even.
What’s old is new again for agents!
1. https://housecat.com
Do be aware that CGI, unlike FastCGI, has a pretty big footgun due to the use of environment variables to convey HTTP headers: https://httpoxy.org/
Go's CGI server implementation doesn't set $HTTP_PROXY so you're safe from that, but I still don't love how CGI uses environment variables.
> I still don't love how CGI uses environment variables.
Neither do I. They really only make sense in the context of a request which was actually to a CGI script resident in a document root - they're an exceptionally awkward way of describing other HTTP requests, especially ones which aren't being served from a document root. And there's a lot of information lost in translation, like the order and original capitalization of HTTP headers. (Not that these things are supposed to matter, but still.)
The mystery is why uWSGI isn't more widely used. Perhaps the name does not help. It has little to do with WSGI just as FastCGI is unrelated to CGI.
It is a tiny binary protocol, with frames just like FastCGI. The reference server works with several languages; I've used it over the years mostly with Python but also Ruby and Perl. It is a small C executable with all the practical features one needs for web hosting: draining backends, autoscaling, logging, chrooted backends, everything.
Very few FastCGI servers are this mature. Unlike FastCGI, it has been extended to support websockets and async.
I have used it in production at several places for many years and have nothing but praise for it. It feels like this weird unknown secret for web operations. Unfortunately, it sees lesser use now in the cloud era, and development seems to have all but stopped. It still works and is still reliable but the writing is probably on the wall. However nothing comes close in terms of speed, simplicity, and features.
I was reading the article and thinking, I wonder what that proxy I'm using in Apache is using, it's really fast and I've had a lot of luck with it. OMG, so I've been using FastCGI all this time and had no idea, well that's awesome. :)
Does anyone remember mongrel2? A neat web server that used zeromq for backend communication (based on scgi). I always felt it had huge potential. It was also kinda famous in the early days of HN, back in the days when RoR was in vogue etc.
https://hn.algolia.com/?q=mongrel2
FCGI is also an orchestration system. It launches more server tasks when the load goes up, shuts them down when the load decreases, and launches new copies of tasks if they crash. It's like single-system Kubernetes.
> It launches more server tasks when the load goes up, shuts them down when the load decreases
In my experience, this isn't a good feature. It sounds nice, but it can often mean everything runs fine while your load is low, but when your load gets high, you spawn more workers and run out of memory. It's much better to have a static number of workers in my experience.
Crash recovery is handy, if needed though.
It's useful if you have multiple FCGI programs that handle different kinds of requests. Depending on what's being requested, programs start up and shut down.
This is exactly how we used it.
The PHP/Apache configuration that is distributed in the Red Hat family is "FastCGI Process Manager" (FPM).
I don't know if anything else in the RHEL distributions use FastCGI.
What you're looking for is mod_proxy_fcgi, not FPM. It's included in Fedora's httpd-core package; I don't know about RHEL: https://packages.fedoraproject.org/pkgs/httpd/httpd-core/fed...
I'm not looking for anything. I use this now, and it works.
I don't really know anything about the FastCGI.
Interesting.
Most of the stuff I've done for reverse proxies has been pretty straightforward and just using the stuff built into Nginx, but I have to admit that it wouldn't have even occurred to me to use FastCGI if I needed something more elaborate.
I used FastCGI a bit about ten years ago to "convert" some C++ code I wrote to work on the web, but admittedly I haven't used it much since then.
Also, embedded servers are now much much much more popular. Stuff an HTTP server directly into your application and do whatever you gotta do without gateways.
That is the way! Unfortunately, sometimes you have to do path-based routing to different backends, and now you're back to needing a proxy between your clients and your applications.
This is the way only if you're operating in a trusted environment (eg. homelab, intranet) or you're sticking CloudFlare or some other "reverse proxy as a service" in front of it. If you expose an embedded HTTP app server directly to the Internet you're almost guaranteed to get pwned, as the Internet has now become an extremely hostile place.
Go's embedded HTTP server can handle it just fine: https://blog.gopheracademy.com/advent-2016/exposing-go-on-th...
These are often not "battle-tested" enough and come with a warning never to expose them to the public internet. So then you put a WAF in front of it, and you are back to an HTTP reverse proxy setup.
I've always chuckled at this. Just don't use bad HTTP server libraries. I wouldn't put something like that on my intranet either.
But even if you disagree with me the point is that I can count on only one hand the number of times I went "oh man, I need a FastCGI middle end".
This seems like really bad advice, or am I missing something?
Using fastcgi requires you write your app to serve fastcgi.
The upside of serving http/1.1 instead of fastcgi is that devs can instantly use their browser to test things instead of having to setup a reverse proxy on their machine.
The bad parts of http/1.1 are fixed equally well by both http/2.0 and fastcgi. So just use http/2.0 and you get the proper framing as well as browser support.
Please see the section about untrusted headers - this is not fixed by HTTP/2.
You're right that being able to point your browser right at the app is very convenient. With Go, you can have a command line flag that switches between http.Serve (for development) and fcgi.Serve (for production).
In my experience having different serving paths for dev vs production is a recipe for annoying issues. I try to make dev as similar to prod as possible.
I’m not sure, I don’t dismiss fcgi outright here, I find the arguments for it compelling (not a huge fan of http for many reasons) but it has to be really worth it to break the consistency of using http everywhere.
If you want your dev environment to be as similar to prod as possible, and you use a proxy in prod, then you should use a proxy in dev also. I was presenting a solution to someone who doesn't want to do that.
I think perhaps I was unclear. I don’t mean the entire dev environment should mirror prod (although it’s great if you can do this for end to end testing). I just mean it’s desirable if the process you’re working on operates the same way in dev as in prod.
FastCGI is theoretically better does not make it the easier choice in reality, the success of HTTP is just another case of "worse is better"
I've built a lot of API backends with Perl and FCGI::ProcManager, letting nginx (and Apache HTTPd in the past) front everything. For me it has been a pleasantly simple, incredibly robust and high-performing setup with no mess to speak of.
> Only if True-Client-IP doesn't exist does it use X-Real-IP. So even if your proxy does the right thing with X-Real-IP, you can still be pwned by an attacker sending a True-Client-IP header.
Can we just take a moment to appreciate the absurdity of HTTP headers? We have X-Forwarded-For and X-Real-IP, and each CDN has its own custom-flavored one. Some of them are a comma-separated list, and you usually end up with an IP of your own LB uselessly added in there (I know why, it's just not helpful). All of them might be inserted by a malicious user-agent. I guess nobody could agree on how all the various trusted servers in the pipeline should convey the important bit.
I guess it fits in quite well with the absurdity of the User-Agent header, which has gone so far into absurdity that Apple decided to fully kill it by just sending utterly fake nonsense (false OS version, etc.) in the name of "pRiVaCy."
I think there is a lot of merit to this argument, however, FastCGI defers to CGI/1.1 for `PATH_INFO`, etc., which is lossy as it must be URL-decoded and therefore cannot represent encoded slashes, `%2F`. (Some implementations also collapse `//` to `/` in path, but this is an issue in various HTTP implementations too.)
It is less expressive than HTTP in ways that may or may not be important to your application; I prefer accurate URL handling.
I've fought many battles with perl + windows + apache + FastCGI in a previous life. No thank you.
Indeed. I'm sure that someone will butt in with "it's just a bad implementation!" but the whole bit about allowlisting communications will cause flashbacks in those of us who had all our PUT requests just quit working on an IIS server.
> an IIS server
There's a reason the internet runs on Linux...
I've had good experiences with FastCGI back when Perl was popular. These days, WebTransport is the new sexy thing. Probably not a real FastCGI replacement.
(u)WSGI must surely get a mention here?!
Then there's uwsgi protocol. It's also an RPC for basically everything.
As I understand it FastCGI doesn't handle websockets, which is a shame. It should be able to handle SSE though since that's effectively just a regular slow-loading/streaming HTTP response.
I am actually implementing a reverse proxy right now...
I am doing a typical http thing, but I wonder, has anyone used fastcgi in Caddy?
https://caddyserver.com/docs/caddyfile/directives/reverse_pr...
What I'd like to see is someone creating local caching proxy for modern https infested world. I'm fed up with downloading same packages 100 times.
There is no way to do what you ask for without having a local CA. And once you have it, any existing caching proxy will work. Now you have moved all security to the proxy, but for some applications it is worth it.
I’m in the same boat. Am trying to figure out how to configure Vinyl cache (née Varnish) in my home lab.
Can't squid do this?
Not easily. I always dread setting it up for this. I'd love to have one click solution with nice ui.
I'd love for CGI to be updated, kind of merging what works and not really caring about what does not work. Getting a .cgi file to work on Linux is really easy. Naturally you get more leverage with e. g. rails, but there is also a lot more complexity and I really hate intrinsic complexity.
CGI and FastCGI are two different things in two different domains. Well, the domains are not that different, but enough that CGI solves a real problem and makes sense and FastCGI does not. CGI is the interface between an HTTP transaction and a process. It answers the question "How do we turn an HTTP request into executing a process?". FastCGI answers the question "How do we turn an HTTP request into a FastCGI request?", a convolution that leaves you asking "Why are we jumping through this hoop? Is FastCGI actually bringing anything to the table? Is it actually more difficult to have an HTTP server instead of a FastCGI server if they are so trivially connected?"
I am halfway convinced the only reason FastCGI exists is that we had gotten into a mindset that executable code in an HTTP context had to run via the Common Gateway Interface, so when we wanted to change to a persistent process model it had to have the CGI name as well. Well, FastCGI to the rescue: it does exactly what HTTP does but is not HTTP and, most importantly, has CGI in the name.
As to the article's complaint, "an HTTP relay server had a bug, therefore HTTP is intrinsically bad"... well, it failed to convince me. I am not exactly in that domain (backend web development) so my view is not worth much. But I feel that your internal HTTP (application) servers should be built as if they were going directly on the open web. Then you put some relay servers in front in order to block, balance and route requests. But avoid putting too many smarts in the relay servers. A smart network is almost always a bad idea; try to stick with a dumb network and smart edges.
Your question is why we have http servers and http middleware at all. Running an application directly on an endpoint makes sense for embedded application, but for everything else you will end up with a separate http terminating layer for a number of reasons.
It makes maintenance so much easier when you can scale request backends independently of request frontends. It makes sense that the application doesn't have access to TLS keys and can't bind end-user-facing ports directly. The day you want a separate access log from the application log, and separate instrumentation of the frontends because of some hard-to-track-down bug, you will be thankful it is there.
So, you want to deploy some sort of proxy. Will you run FastCGI, SCGI, WSGI, uWSGI, plain HTTP, or one of a number of proprietary load balancer protocols? If you choose HTTP, you have to take care to wash your headers and be very careful with pipelining so the front and back end expect the same behaviour (which is true for several of the proprietary protocols too). Otherwise there can be catastrophic security implications.
It is probably true that FastCGI was designed as a persistent CGI. But as soon as you have persistence, the http server needs to forward transactions to the persistent backend, and what you have built is a proxy. There is no way around it.
> Is FastCGI actually bringing anything to the table?,
Yes; it removes the need for the application server to perform parsing of HTTP, which is notoriously difficult to consistently do safely. The application server can use the safer and simpler FastCGI protocol rather than try to support the full HTTP spec.
> Is it actually more difficult to have a HTTP server instead of a FastCGI server if they are so trivially connected?
IME, yes. HTTP (the spec) is full of footguns. FastCGI has fewer footguns. HTTP requires a long dependency chain in your application. FastCGI requires maybe a single library.
> But I feel that your internal HTTP(application) servers should be built as if they were going directly on the open web.
I feel that too, but which developer do you know writes their own HTTP server inside their application? They all use the most popular server via a library or framework, almost all of which warn not to open that to the public internet.
What do you propose they do? They use a framework/library and get a warning not to expose it to the public internet, they don't use a framework/library and odds are good that they coded some vulnerability into it.
Embedding http server (or fcgi one for that matter) is trivial in most languages and also makes local development simpler. CGI was horrible idea back then and it continues to be