Web Architect - An Introduction

Written by Antoine Gicquel - 06/09/2023 - in Pentest

This article is the first of a series detailing various security aspects of the most common technologies one can encounter on the web, starting with Content Management Systems (CMSs). As of today, most of the CMS market share is held by PHP-based solutions (WordPress accounting for most of it, admittedly). They are therefore very common in web pentest engagements. This article and the following ones will tell you what you need to know to get started when facing one of them, by studying two of the most common ones, namely WordPress and Magento. While they are totally different products and do not share any code, some similarities can be observed in their architecture. Let’s go over them from a high-level view in this first article.


CMS 101

What is a CMS?

A CMS, or Content Management System, is a web-based software application that allows users to create, manage, and modify digital content on a website without requiring extensive technical knowledge. Its main function is to keep website content and design separated from the code and infrastructure, and to provide a user-friendly interface that simplifies the process of content creation and publication.

CMSs are an attractive solution for individuals or businesses that cannot spend the time or money to build and maintain a fully custom website. They come with features such as content creation and editing through an intuitive dashboard, user management, design customization and Search Engine Optimization (SEO) friendliness.

Popular CMSs include WordPress, Drupal, Joomla, and Magento, each with its own strengths and target audiences. Organizations and individuals choose a CMS based on their specific needs, such as the type of website they want to create (e-shop, blog, digital artifact management...), the level of customization required, and the technical expertise available.

 

How does a classic CMS work?

Users and roles

On a classic CMS, there are typically different types of users with varying levels of access and permissions:

  • Administrator: they have the highest level of access and control over the CMS. An administrator can manage all aspects of the website, including user accounts, site settings, plugins/themes, content management, and overall website configuration. Administrators can also create and modify user roles and assign permissions.
  • Author/Editor: registered users on the backend part of the website. They have the ability to create, edit, and publish content on the customer-facing part of the website. While they have access to the backend panel, they cannot usually manage user accounts and permissions.
  • Registered User: users who have created an account on the customer-facing side of the website. They may have access to additional features like commenting or customizing a user profile. Their permissions are generally limited to interacting with the customer-facing part of the website and their own profile settings.
  • Guest: users who access the website without creating an account or logging in. They can view public content, browse pages, and interact with certain features that are available to non-registered users, such as submitting contact forms, accessing publicly accessible resources and using the e-shop for e-commerce websites.
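
The role hierarchy above can be pictured as a mapping from roles to capabilities. The following Python sketch is only an illustration, not the implementation of any particular CMS; the role and capability names are hypothetical.

```python
# Hypothetical role -> capability mapping; all names are illustrative only.
ROLE_CAPABILITIES = {
    "guest": {"read_public"},
    "registered": {"read_public", "comment", "edit_own_profile"},
    "editor": {"read_public", "comment", "edit_own_profile",
               "access_backend", "publish_content"},
    "administrator": {"read_public", "comment", "edit_own_profile",
                      "access_backend", "publish_content",
                      "manage_users", "manage_plugins", "manage_settings"},
}


def role_can(role: str, capability: str) -> bool:
    """Return True if the role holds the capability (unknown roles hold none)."""
    return capability in ROLE_CAPABILITIES.get(role, set())
```

With this model, `role_can("editor", "manage_users")` is false while `role_can("administrator", "manage_users")` is true, which mirrors the split between editors and administrators described above.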

 

Code paths and Routing

When a request is received by a CMS, the code execution typically goes in the following order. The entry point file (index.php for a PHP CMS as an example) is loaded, which includes the necessary code and configuration files to bootstrap the CMS environment, variables, classes and functions. The routing system then determines the appropriate controller or action based on the requested URL, using a mapping configuration. The controller is then invoked to handle the request, performing operations and interacting with models, databases or other components as needed. Once the necessary data is obtained, the controller prepares it to be rendered and passes it to the view layer. Views, which are usually templates containing HTML, CSS, and dynamic placeholders, are then rendered using a templating engine. The engine replaces the placeholders with the actual data, resulting in the generation of HTML output.
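
As a rough illustration of this flow, here is a minimal front-controller sketch in Python. It is not taken from any CMS: the route table, the `/post` URL and the function names are assumptions made for the example, and the template engine is simply Python's `string.Template`.

```python
from string import Template

# Hypothetical route table: URL path -> controller function.
ROUTES = {}


def route(path):
    """Register a controller for a URL path (the mapping configuration)."""
    def register(controller):
        ROUTES[path] = controller
        return controller
    return register


def render(template, data):
    """View layer: replace the placeholders with the actual data."""
    return Template(template).substitute(data)


@route("/post")
def show_post(request):
    # A real controller would query a model or the database here.
    data = {"title": "Hello", "body": "First post"}
    return render("<h1>$title</h1><p>$body</p>", data)


def handle_request(path, request=None):
    """Entry point: look up the controller for the URL and invoke it."""
    controller = ROUTES.get(path)
    if controller is None:
        return "404 Not Found"
    return controller(request or {})
```

Calling `handle_request("/post")` walks through routing, controller and view in that order; an unknown path falls through to the 404 branch. A real view layer would also escape the data before inserting it into the template.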

 

[Figure: Code path of a usual request]

 

Session and authentication data are usually handled during the bootstrapping phase, which sets the right variables in the CMS environment. It is then the responsibility of the controller to check whether the user is authenticated (through the previously set CMS variables) and to enforce ACL rules.
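
This division of labour can be sketched as follows. The example is a simplified assumption of how such a check might look; the session store, the `delete_posts` capability and both function names are hypothetical.

```python
def bootstrap(cookies, sessions):
    """Bootstrapping phase: resolve the session cookie into a user record."""
    user = sessions.get(cookies.get("session_id"))
    return {"current_user": user}  # None when the visitor is a guest


def delete_post_controller(env, post_id):
    """The controller itself must enforce authentication and ACL rules."""
    user = env["current_user"]
    if user is None:
        return 401, "authentication required"
    if "delete_posts" not in user["capabilities"]:
        return 403, "forbidden"
    return 200, f"post {post_id} deleted"
```

If a controller forgets either check, nothing in the bootstrapping phase will make up for it, which is why missing ACL checks in controllers are worth looking for during an audit.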

 

DB schema

In a CMS, the database is typically used to store various types of data that are essential for managing and displaying content on the website, namely:

  • Website content: includes articles, blog posts, pages, and other types of content, with fields like title, content body, author, publication date, status, and associated metadata.
  • User data: includes username, hashed password, email, user roles, and profile data.
  • Settings and configuration: stores information for CMS operation and customization. Includes site settings, email configuration, SEO settings, …
  • Audit data and logs: tracks actions within the CMS (user activity, content modifications, system events).
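
To make these four categories concrete, here is a hypothetical, heavily simplified schema created in an in-memory SQLite database. The table and column names are assumptions for illustration and do not match the schema of WordPress, Magento or any other CMS.

```python
import sqlite3

# Hypothetical schema: one table per category of data listed above.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    email TEXT,
    role TEXT NOT NULL
);
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    body TEXT,
    author_id INTEGER REFERENCES users(id),
    published_at TEXT,
    status TEXT
);
CREATE TABLE settings (name TEXT PRIMARY KEY, value TEXT);
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    action TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def create_database():
    """Create an in-memory database holding the four hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

From an attacker's point of view, the users and settings tables are usually the most interesting ones once database access is obtained.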

 

Cron

CMSs often use a cron-like mechanism to perform scheduled tasks and automate certain operations like cache management, fetching CMS and plugin updates, sending emails, generating a sitemap, creating a backup, cleaning the database…

Administrators or developers register these tasks by specifying code, execution intervals, and parameters. The CMS provides instructions for setting up a crontab on the server’s operating system, which acts as a system-level scheduler and calls the CMS’s scripts at a defined interval to run the specified tasks. Some CMSs log these cron jobs, so that an administrator has an easier time debugging in case of failure.
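
The mechanism can be sketched as a small task registry that the system scheduler triggers periodically. This is a toy model under stated assumptions: the `Scheduler` class and its method names are hypothetical, and time is passed in explicitly as a number of seconds to keep the example deterministic.

```python
class Scheduler:
    """Toy task registry: the system crontab would call run_due() periodically."""

    def __init__(self):
        self.tasks = []  # each entry: [name, interval, next_run, callback]
        self.log = []    # (time, task name, outcome) tuples for debugging

    def register(self, name, interval, callback, now=0):
        """Register a task to run every `interval` seconds."""
        self.tasks.append([name, interval, now + interval, callback])

    def run_due(self, now):
        """Run every task whose next_run time has passed and log the outcome."""
        for task in self.tasks:
            name, interval, next_run, callback = task
            if now >= next_run:
                try:
                    callback()
                    self.log.append((now, name, "ok"))
                except Exception as exc:
                    self.log.append((now, name, f"failed: {exc}"))
                task[2] = now + interval
```

The log kept by `run_due` plays the role of the cron job logging mentioned above: a failing task is recorded instead of silently disappearing.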

Plugins

Plugins are a central feature of CMSs that allows administrators to add features and customize their websites. These plugins do not modify the core files and folders, and only come as additional files and classes, loaded from their designated folder on each request. They offer a wide range of features, such as adding new content types, integrating with third-party services, improving search engine optimization, enhancing security, and implementing custom functionality. By utilizing plugins, CMS users can tailor their websites or applications to their specific needs, without the need for extensive programming knowledge or making significant changes to the CMS's core code. However, although there may be a verification phase before a plugin is available on the CMS store, the security of these is much less guaranteed than that of the core codebase, since they undergo far fewer security audits and are not maintained by the CMS core team. Thus, plugins and custom code will very often be your primary source of vulnerabilities in a CMS-built website.
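
One common way to let plugins extend a CMS without touching core files is a hook system. The sketch below is a toy version in Python: the function names echo WordPress's hook API (`add_filter`, `apply_filters`), but the implementation and the `the_title` example are simplified assumptions.

```python
from collections import defaultdict

HOOKS = defaultdict(list)  # hook name -> list of plugin callbacks


def add_filter(hook, callback):
    """Called by plugin code to attach a callback to a named hook."""
    HOOKS[hook].append(callback)


def apply_filters(hook, value):
    """Called by the core: pass the value through every registered callback."""
    for callback in HOOKS[hook]:
        value = callback(value)
    return value


# A hypothetical plugin, living in its own file, alters core behaviour
# without any change to the core code itself.
add_filter("the_title", lambda title: title.upper())
```

Because every callback registered this way runs inside the core's request handling, a single vulnerable or malicious plugin is enough to compromise the whole site.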

 

Security measures

Post-exploitation - RCE as a feature?

Installing a malicious plugin/theme / Modifying an existing plugin/theme

As you probably know, some CMSs (WordPress, Drupal, Joomla and others) let administrators upload plugins. This mechanism provides a clear way of executing arbitrary code on the underlying system [1] (server / container / ...): craft a malicious plugin that acts as a web shell and install it on the target instance. However, the account you compromised may not have the right to upload plugins. Don't worry, you are not short of alternatives. Maybe you could modify an existing plugin, loaded on each request, to include a backdoor? Or maybe you are allowed to upload or modify a theme? The choice is yours. Keep in mind, however, that if your code has a bug and is executed on each request (for instance on theme loading), you might end up locking yourself and others out of the website, every request resulting in a 500 Internal Server Error, with no way to revert your changes short of intervening directly on the server.

This arbitrary code execution is a feature here, but it can also be seen as a severe security flaw in the CMS. That is why, in recent years, some CMSs (e.g. Magento) have removed the ability to upload and modify themes and plugins from the administrator dashboard and tried to block potential escape mechanisms to the underlying context.

 

Using a regular file upload feature

It is common for CMSs to allow administrators to upload files. An easy win could consist in uploading a file which the server will treat as code, and then accessing it in the uploaded resources directory. If the CMS does not allow you to upload an interpretable file, there are still a couple of tricks you could try to get code execution, e.g. uploading a .htaccess file that tells the server engine to parse a custom extension, and then uploading a file with that custom extension.
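
To see why such a trick can work, consider a naive upload filter based on an extension denylist. The filter below is a hypothetical Python sketch, not the logic of any specific CMS, and the denylist content is an assumption.

```python
import os

# Hypothetical denylist, as a naive upload filter might implement it.
BLOCKED_EXTENSIONS = {".php", ".phtml", ".phar"}


def is_upload_allowed(filename: str) -> bool:
    """Reject files whose extension is on the denylist (case-insensitive)."""
    extension = os.path.splitext(filename)[1].lower()
    return extension not in BLOCKED_EXTENSIONS


# Neither file below carries a blocked extension, so both pass the filter.
# On an Apache + mod_php setup that honours .htaccess overrides (an
# assumption about the target), a directive such as
#   AddType application/x-httpd-php .xyz
# in the uploaded .htaccess would then make shell.xyz run as PHP.
two_step_upload = [".htaccess", "shell.xyz"]
```

Here `is_upload_allowed("shell.php")` is rejected, while every entry of `two_step_upload` is accepted, which is exactly the gap the two-step upload exploits. Allowlisting known-safe extensions and storing uploads outside the web root are the usual countermeasures.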

 

Convenient security methods

The nice part about it

Usually, CMSs provide a range of valuable protections and security features, making them a reliable choice for website owners concerned about safeguarding their data. One of the primary advantages is secure user authentication combined with an access control list (ACL) system. CMSs allow administrators to manage user roles and permissions, ensuring that only authorized individuals can access sensitive areas of the website or perform certain actions, and they store passwords according to a proper storage policy.

CMSs also take the burden of session and cookie management off webmasters’ minds, providing protection against session hijacking attacks. Furthermore, CMSs incorporate measures to mitigate common web vulnerabilities, such as XSS and SQL injection, by providing developers with functions and classes to escape and sanitize untrusted data, like WordPress’ sanitize_text_field function.
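
The value of such helpers is easiest to see on SQL injection. The sketch below uses Python's `sqlite3` module as a stand-in for a CMS database layer; the `users` table and both function names are hypothetical. The bound-parameter version plays a role comparable to the prepared-statement helpers that CMSs expose, such as WordPress's `$wpdb->prepare()`.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: untrusted data is concatenated into the SQL statement.
    query = f"SELECT id FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the driver binds the value, so it is never parsed as SQL.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()
```

With a payload such as `x' OR '1'='1`, the first function returns every row of the table, while the second one treats the whole payload as a (non-existent) user name.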

Other features like automatic core and plugin updates, proper file handling and logging of sensitive actions also help raise the overall security level of a website built using a CMS.

 

The less nice part

First, let’s go over the obvious: these security measures will never protect a lazy webmaster from using password123! as their password, thus compromising the entire security of the website.

Then, as stated by Shreya Pohekar & Sheeraz Ali in their NullCon 2022 talk [2], this feeling of security provided by the CMS can be misleading. Indeed, while a developer can keep in mind the mantra “Always escape user data”, most CMSs provide at least half a dozen escape_* / sanitize_* functions, each one dealing with a slightly different set of characters and serving a specific purpose.

Non-exhaustive list of WordPress escaping and sanitizing functions (43 functions)
  • sanitize_email()
  • sanitize_file_name()
  • sanitize_hex_color()
  • sanitize_hex_color_no_hash()
  • sanitize_html_class()
  • sanitize_key()
  • sanitize_meta()
  • sanitize_mime_type()
  • sanitize_option()
  • sanitize_sql_orderby()
  • sanitize_term()
  • sanitize_term_field()
  • sanitize_text_field()
  • sanitize_textarea_field()
  • sanitize_title()
  • sanitize_title_for_query()
  • sanitize_title_with_dashes()
  • sanitize_user()
  • sanitize_url()
  • esc_attr()
  • esc_html()
  • esc_js()
  • esc_textarea()
  • esc_sql()
  • esc_url()
  • esc_url_raw()
  • wp_kses()
  • wp_kses_array_lc()
  • wp_kses_attr()
  • wp_kses_bad_protocol()
  • wp_kses_bad_protocol_once()
  • wp_kses_check_attr_val()
  • wp_kses_decode_entities()
  • wp_kses_hair()
  • wp_kses_hook()
  • wp_kses_html_error()
  • wp_kses_js_entities()
  • wp_kses_no_null()
  • wp_kses_normalize_entities()
  • wp_kses_post()
  • wp_kses_split()
  • wp_kses_stripslashes()
  • wp_kses_version()

 

As a result, developers who have the right idea in mind and follow security best practices might still end up writing vulnerable code because they misuse a security function. Shreya Pohekar & Sheeraz Ali take the example of code which sanitized user input using sanitize_text_field, a WordPress function that strips HTML tags and extra whitespace but does not escape quotes, and therefore does not protect the application from XSS payloads in every output context (inside an HTML attribute, for instance). This user input was then directly reflected in the page, and while a developer could think “I’ve sanitized this user input, I can now display it safely”, the sanitization was not appropriate for the context in which the variable was used, leading to an XSS vulnerability.
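
The mismatch between a sanitizer and the output context can be illustrated with a small Python sketch. The `strip_tags` function below is a crude, hypothetical stand-in for a tag-stripping sanitizer; it does not reproduce the logic of any WordPress function, and the payload is only an example.

```python
import html
import re


def strip_tags(value: str) -> str:
    """Crude stand-in for a tag-stripping sanitizer (not a real CMS function)."""
    return re.sub(r"<[^>]*>", "", value)


def render_attribute(value: str) -> str:
    """Reflect a value inside a double-quoted HTML attribute."""
    return f'<input value="{value}">'


payload = '" onfocus="alert(1)" autofocus="'
sanitized = strip_tags(payload)           # no tags to strip: payload unchanged
vulnerable = render_attribute(sanitized)  # the quote breaks out of the attribute
safe = render_attribute(html.escape(payload, quote=True))
```

The sanitizer did its job, since there was no tag to remove, yet `vulnerable` now contains an attacker-controlled `onfocus` handler. Escaping for the attribute context, as done for `safe`, turns the quotes into `&quot;` and keeps the payload inside the attribute value.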

In the same vein, what would you think of a function provided by your favorite CMS, named is_admin()? As CMSs usually handle user authentication and ACL rules, surely it tells you if the request comes from an authenticated administrator, right? Wrong! According to the WordPress documentation [3]:

is_admin(): bool

Determines whether the current request is for an administrative interface page.

So this function will always return true on a supposedly protected page… The correct ACL check is done via the current_user_can() WordPress function, which is linked a few lines below on the same documentation page. The main takeaway here is: you can trust neither your intuition nor function names when writing or auditing CMS code. Reading the documentation can save you from many biased shortcuts which could lead to serious vulnerabilities.
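
The difference between the two kinds of check can be modelled in a few lines of Python. This is a deliberately simplified model and not WordPress code: the path prefix test, the capability name and all four function names are assumptions made for the example.

```python
def is_admin_page(request_path: str) -> bool:
    """Mimics the documented semantics: is the request FOR an admin page?"""
    return request_path.startswith("/wp-admin/")


def user_can(user: dict, capability: str) -> bool:
    """The actual permission check: does this user hold the capability?"""
    return capability in user.get("capabilities", set())


def delete_plugin_wrong(request_path, user):
    # Bug: checks where the request goes, not who sent it.
    return "deleted" if is_admin_page(request_path) else "denied"


def delete_plugin_right(request_path, user):
    # Correct: the decision depends on the capabilities of the current user.
    return "deleted" if user_can(user, "delete_plugins") else "denied"
```

In this model, a low-privileged user who can reach any URL under the admin path passes the first check, whereas the second one denies the action regardless of the URL.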

 

Conclusion

Understanding the inner workings of CMSs is vital for pentesters when assessing the security of a website built with such software. While the information presented here is sufficiently general to apply to most CMSs out there, keep in mind that every CMS is different and serves a specific purpose, with its own architecture and specificities. The following articles will dig deeper into two of the most popular ones, Magento and WordPress, first with a detailed presentation of each CMS’s specificities and then focusing on the template engine for Magento, sprinkled with some tips for security research on these projects.