1. What is Flopsar

What matters most in systems maintenance? Keeping systems highly available and responsive. This means keeping the number of outages as low as possible and the time needed to resolve any issue as short as possible. When the worst happens and a system fails, knowing that it failed is not enough; we need to know why. Even that is of little use if finding the cause takes a whole day or longer. These are the main problems that need to be addressed.

Flopsar is fault detection and diagnosis software for Java systems. Its main goal is to detect errors and problems in your Java systems and to help you find their root cause quickly. Unlike other APM-like tools, Flopsar does not require users to predict what may go wrong in their systems. Moreover, it does not lose information about your business processing by collecting only vague averages and aggregated data. Flopsar is a highly customizable and extensible product. It is not tied to any specific Java application; it only needs a JVM.

How does Flopsar actually help users find the root cause of a problem? The Flopsar GUI application has a novel feature called the Galaxy view. It is a system overview that lets users spot suspicious behavior. Usually, when something goes wrong in the system, it is clearly visible in the Galaxy view as an emerging anomaly pattern. Users can quickly investigate what causes such patterns: a two-step drill-down is enough to reach the root cause of the problem a pattern represents.

The idea behind the Galaxy view is simply the remarkable human ability to recognize patterns. The more users work with the Galaxy view, the faster they are able to find the root cause of a problem. Sufficiently experienced users do not have to investigate any deeper; they know what went wrong from the pattern alone.

Flopsar has more key features that help users maintain their systems better. Its area of application is not limited to systems maintenance; it is also used successfully in application development. Its features can be extended with so-called formatters, which enable users to perform extra processing at runtime without changing the application source code.

Note

Flopsar ® is a trademark of Flopsar Technology Sp. z o.o., registered in the EU.

2. Versioning

Every version label is composed of three numbers X.Y.Z:

X
Major version number: major changes, e.g. architecture change.
Y
Minor version number: minor changes, e.g. new functionalities.
Z
Small version number: bug fixes.

Note

Versions X.Y are equivalent to X.Y.0.

You should not mix binaries from different releases, since they are usually not compatible. You can mix only binaries that share both the major and the minor version number.

Warning

Incompatible components are unable to work together. Any connection between two incompatible components is dropped.
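The compatibility rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this section; the function names `parse_version` and `are_compatible` are hypothetical and are not part of any Flopsar API.

```python
from typing import Tuple


def parse_version(label: str) -> Tuple[int, int, int]:
    """Parse an 'X.Y' or 'X.Y.Z' label into (major, minor, small)."""
    parts = label.strip().split(".")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected X.Y or X.Y.Z, got {label!r}")
    numbers = [int(part) for part in parts]
    if len(numbers) == 2:
        numbers.append(0)  # X.Y is equivalent to X.Y.0
    return numbers[0], numbers[1], numbers[2]


def are_compatible(a: str, b: str) -> bool:
    """Binaries can be mixed only if major and minor numbers both match."""
    return parse_version(a)[:2] == parse_version(b)[:2]
```

Under this rule, a 2.4 manager and a 2.4.3 database would be treated as compatible, while 2.3.1 and 2.4.0 components would not.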

3. Releasing

Flopsar major/minor (X.Y) versions are released once a year. Bug-fix releases (X.Y.Z) do not follow this rule and can be released more often.

4. Licensing

Flopsar is proprietary software and you need a valid license to run it. The license is a license.key file, which must be obtained separately. Some restrictions on Flopsar usage apply, depending on the license type you purchased.

5. Architectural Fundamentals

Flopsar consists of four main components: Agent, Manager, Database and Workstation. There is one extra component, Flopsar Database Connector (fdbc), which allows access to the Flopsar environment.


Fig. 5.1 Flopsar Architecture

The figure above shows the architecture of the entire Flopsar environment. There can be only one manager instance per environment. The other components can run in multiple instances. The links denote all the TCP connections in the environment, and the arrows denote the direction of data flow. All connections are bidirectional except those from agents to database instances.

5.1. Networking

There are three TCP servers running in the Flopsar environment. In order to run the environment successfully, you must prepare your network so that it allows all the required connections to be established.


Fig. 5.2 Flopsar Environment Networking

The server addresses are configured in the configuration file of each relevant component (see the Configuration section of the respective component for details). Each FDBC client establishes two TCP connections per database instance: the first provides online data, while the second handles user queries.

5.2. Agent

The agent is a single jar file, flopsar-agent-2.4.jar. It is written entirely in Java and is based on the ASM instrumentation framework. It was designed with performance in mind, so that it has minimal impact on JVM performance. It does not perform any processing; it just collects the data and sends it as soon as possible.

When a JVM starts, the agent connects to the manager instance. It is the agent that initiates the connection, i.e. the agent runs no internal server waiting for connections. Once the connection is established, the agent starts a separate thread, which opens a channel for incoming manager messages. The communication between agent and manager is bidirectional. If the agent loses its connection to the manager, it tries to reconnect automatically.
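The client-initiated connection with automatic reconnection described above follows a common pattern, sketched below in Python. This is not the agent's actual code (the agent is written in Java): the function name `connect_with_retry`, the fixed delay and the attempt limit are all assumptions made for illustration, since this document does not say how the agent paces its reconnection attempts.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    max_attempts: Optional[int] = None,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, waiting between failed attempts.

    max_attempts=None retries forever. The fixed delay is an assumption,
    not documented Flopsar behavior.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except OSError:
            # Connection errors such as ConnectionRefusedError derive from OSError.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay_seconds)
```

For a real TCP client, `connect` could wrap `socket.create_connection` with the manager host and port taken from the client's configuration; those values are deliberately left out here.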

Important

If you want to change the agent file name, you should also change the value of Boot-Class-Path entry in the manifest file of the agent jar accordingly.
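After renaming the agent file, it can be useful to check that the manifest still agrees with the new name. The sketch below reads the main section of `META-INF/MANIFEST.MF` with Python's standard `zipfile` module and checks the Boot-Class-Path entry. The helper names `read_manifest` and `boot_class_path_matches` are hypothetical, and the check assumes that the entry lists the jar's own file name, as the note above implies.

```python
import zipfile
from pathlib import Path
from typing import Dict


def read_manifest(jar_path: str) -> Dict[str, str]:
    """Return the main-section attributes of a jar's META-INF/MANIFEST.MF."""
    with zipfile.ZipFile(jar_path) as jar:
        text = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    attributes: Dict[str, str] = {}
    last_key = None
    for line in text.splitlines():
        if not line:
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key is not None:
            # Continuation line: long manifest values wrap with a leading space.
            attributes[last_key] += line[1:]
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            attributes[last_key] = value.lstrip()
    return attributes


def boot_class_path_matches(jar_path: str) -> bool:
    """Check that Boot-Class-Path lists the jar's own file name (assumed rule)."""
    entries = read_manifest(jar_path).get("Boot-Class-Path", "").split()
    return Path(jar_path).name in entries
```

A result of `False` for a renamed jar suggests that the manifest entry still holds the old file name and needs to be updated.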

5.3. Manager

The manager is the central point of the entire environment. It stores and serves agent configurations, manages user authentication, forwards data from agents to fdbc clients, etc. It is a single point of failure because there is only one instance of it per environment. However, once the environment is set up, i.e. agent configurations are deployed and agents are attached to their databases, the manager is not required as long as no configuration changes or data query operations are needed. In other words, if the manager is down, data acquisition continues.


Fig. 5.3 manager Processes

The manager operates two processes: the MASTER process and its child process, FSM. If the child process crashes, MASTER spawns a new one immediately. The manager runs a single TCP server, which accepts connections from three types of clients: agent, database and fdbc. Multiple Flopsar environments can run on the same server machine as long as each uses its own license file with a distinct IP address.

Important

A license file cannot be shared between different Flopsar environments that run on the same machine.

5.4. Database

The database operates four processes: MASTER, FDB, FPU and ADM. When the database starts, it creates its MASTER process, which spawns the other processes. Whenever one of these child processes crashes, MASTER respawns it immediately. An additional child process, ARCH, is spawned periodically to do its job and exits when the job is done; it is not respawned after a crash.


Fig. 5.4 database Processes

Each process has its own area of responsibility:

MASTER
This process is mainly responsible for watching its child processes.
FDB
This process runs a TCP server, which receives data from agents. It is responsible for data persistence.
FPU
This process runs a TCP server, which handles fdbc requests.
ADM
This process runs a TCP client, which communicates with manager.
ARCH
This process is responsible for data archiving (see Data Archiving for details). It is spawned if and only if the archive feature is enabled in the configuration file.
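The supervision rule described above (MASTER respawns crashed children, except ARCH) can be illustrated with a small simulation. This is a conceptual sketch only, not the native implementation: the function `respawn_dead_children`, the status dictionary and the `spawn` callback are hypothetical, and no real processes are started.

```python
from typing import Callable, Dict, Iterable, List

# Child process names taken from the text. ARCH is left out because it is
# spawned periodically and is not respawned after it exits.
SUPERVISED_CHILDREN = ("FDB", "FPU", "ADM")


def respawn_dead_children(
    alive: Dict[str, bool],
    spawn: Callable[[str], None],
    supervised: Iterable[str] = SUPERVISED_CHILDREN,
) -> List[str]:
    """One pass of a MASTER-style watchdog: respawn every dead supervised child.

    `alive` maps child names to their current status and is updated in place.
    Returns the names that were respawned in this pass.
    """
    respawned = []
    for name in supervised:
        if not alive.get(name, False):
            spawn(name)
            alive[name] = True
            respawned.append(name)
    return respawned
```

Calling the function with a status map in which only FPU is down would invoke `spawn("FPU")` once and leave ARCH untouched, whatever its state.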

The database is a native application and supports the platforms specified in Supported Platforms and Requirements. Its memory and CPU requirements strongly depend on how much data it receives and how many clients it handles.

The database is designed to take advantage of multiple CPU cores. Its internal tasks, which are responsible for collecting and persisting data, can run in parallel provided that there are enough CPU cores. Every four additional CPU cores allow the database to run one more task. The number of tasks is decided at the database initialization step and cannot be changed later. So if you can provide a multi-core CPU, do it before you initialize a database environment.

The database collects received method calls in memory, in so-called collectors. A collector is assigned to each agent connection, and its size depends on the method instrumentation and on your application. 8 GB of RAM is a reasonable minimum.

5.5. Workstation

The workstation is written in Kotlin and based on the JavaFX platform.

6. Supported Platforms and Requirements

The agent is written entirely in Java and supports JVM 1.6, 1.7, 1.8 and 9. The workstation is written in Kotlin and requires JVM 1.8u131+. It runs only on platforms supported by JavaFX.

Virtual Machine    Version
JVM                1.8u131+

Important

workstation does not support JVM 9.

Both manager and database are native executables and they can be run only on the platforms listed below:

Operating System      Architecture
GNU/Linux 2.6.33+     x86_64

Important

Native components do not support 32-bit platforms.

7. Distribution Package

Flopsar distribution binaries include:

flopsar-2.4-Linux.deb
Flopsar manager and database Debian package.
flopsar-2.4-Linux.rpm
Flopsar manager and database RPM package.
workstation-2.4.zip
Flopsar Workstation application.
flopsar-agent-2.4.jar
Flopsar Agent for JVM 8 and earlier.
flopsar-agent-2.4-jvm9.jar
Flopsar Agent for JVM 9+.
EULA
End User License Agreement.
SHA256
SHA-256 checksums of all the binaries.
SHA256.sig
PGP-signed SHA256 file. To verify the signature, use our PGP public key, available at http://flopsar.com/pgp.
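Once the signature on the SHA256 file has been verified with a PGP tool, the checksums themselves can be checked against the downloaded binaries. The sketch below uses Python's standard `hashlib` module. It assumes that the SHA256 file uses the common `sha256sum` layout, one `<hex digest>  <file name>` pair per line, which this document does not confirm; the helper names `sha256_of` and `verify_checksums` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file: str) -> Dict[str, bool]:
    """Check each listed file, looked up next to the checksum file.

    Assumes sha256sum-style lines: '<hex digest>  <file name>'.
    """
    base = Path(checksum_file).parent
    results: Dict[str, bool] = {}
    for line in Path(checksum_file).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # sha256sum marks binary mode with '*'
        target = base / name
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

Any entry mapped to `False` is either missing from the directory or does not match its published checksum and should not be installed.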

8. Release Notes

Important

Flopsar 2.2 and earlier versions are no longer supported.

Changes in workstation

  • GC graphs removed from agent details view.
  • Start page redesigned.
  • Optimizations.

Changes in database/manager

  • Parameter compression algorithm changed to LZ4.
  • Internal authentication embedded in the manager.
  • Optimizations.

Changes in fdbc

  • New C/C++ API for accessing database data.

8.1. How to upgrade from Flopsar 2.3

In order to upgrade to 2.4 you must replace each 2.3 component with the new one.

Important

License keys have changed. Please request a new license key before upgrading. Your current license files will not work with Flopsar 2.4.

If you already have Flopsar 2.3 installed and running, follow the instructions below to upgrade your environment:

  1. Stop your manager and databases.

  2. Install the Flopsar 2.4 binaries.

  3. Migrate your current database to the new version. Before you start migrating, make a copy of your current database files. To migrate your database files, execute the following command:

    $ fs2db start --action migrate <database_home>
    
  4. If you use Flopsar internal authentication, comment out the plugins option in the manager configuration file.

  5. Upgrade agents and workstation.

  6. Start the manager and databases.