I spent 32 years in IT with major corporations, and it is clear to me that the Facebook infrastructure is seriously flawed. A company whose sole business model depends completely on its Internet presence should be on a virtually failsafe infrastructure. If Amazon never goes down, then Facebook should be on AWS infrastructure, but they thought they knew better. Obviously, the organization is also seriously flawed, but of course we already knew that.
Well, you showed me something! But was Amazon down totally, or was there just a higher number of reported problems? I seem to remember one of the hacker groups throwing in the towel a few years ago because they simply could not overwhelm Amazon.
If I were the CIO at Facebook, I would have a minimum of 3 redundant server farms at each site and at least 6 IT sites worldwide, with any one site able to host all the traffic, perhaps with performance hits but definitely not an outage. Facebook has more than enough money to build massive redundancy and extreme availability, with almost instantaneous failover to another IT site. In today's world, highly available and massively parallel application servers are known technologies, as are data replication and failover.
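To make that concrete, here is a rough Python sketch of the failover idea, not anything Facebook actually runs. Everything in it is my own assumption for illustration: the `Site` class, the `route` function, and the figure that one farm can carry half of the worldwide load (so a site with 2 of its 3 farms up can still carry everything). Traffic is spread over whatever sites still have a healthy farm; it only counts as an outage when no site is left, and it is flagged as degraded when the surviving capacity is below the load.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    farms_up: int                # healthy server farms at this site (0 to 3)
    farm_capacity: float = 0.5   # assumed share of worldwide load one farm can carry

    @property
    def capacity(self) -> float:
        return self.farms_up * self.farm_capacity


def route(sites, load=1.0):
    """Spread `load` over healthy sites in proportion to their capacity.

    Returns (shares, degraded): shares maps site name to assigned load,
    degraded is True when the surviving capacity is below the load
    (a performance hit, but not an outage).
    """
    healthy = [s for s in sites if s.farms_up > 0]
    if not healthy:
        raise RuntimeError("outage: no site has a healthy farm")
    total = sum(s.capacity for s in healthy)
    shares = {s.name: load * s.capacity / total for s in healthy}
    return shares, total < load


# Hypothetical layout: 6 sites, 3 farms each.
sites = [Site(f"site{i}", farms_up=3) for i in range(6)]

# Simulate losing five whole sites; the last one takes all the traffic.
for s in sites[1:]:
    s.farms_up = 0
shares, degraded = route(sites)
print(shares, "degraded" if degraded else "full capacity")
```

The hard part in real life is not this arithmetic but the data replication and the constant testing needed to prove the failover actually works.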
They have simply not decided to pay for true high availability and redundancy, or for all of the testing required to continuously validate the architecture and resources. Their objectives are very straightforward: maximum availability and security. For that, you hire the best and you pay for the required infrastructure.