ExaCC: Database Service Events

In April 2022, Oracle released a new feature called Database Service Events for the ExaCC platform. This feature collects telemetry data from the VM cluster nodes and sends a notification when a health issue is detected.

Enable Diagnostics Notification

By default, the feature is disabled for a VM cluster. It can be enabled through the OCI web console or the OCI CLI.

Update dbaascli to version 22.3.1.0.1 (220831) or higher to fix an issue that leads to the generation of huge amounts of incident files in the /var/opt/oracle/log/syslens/sysLens/sysLens/incident directory.

OCI Web Console

Navigate to the details page of your VM cluster and click the Enable link next to Diagnostics Notification in the General Information area.

General Information Area of VM Cluster

After you confirm the popup dialog, a work request is started. It takes about 2 minutes to complete.

Enable Diagnostics Notification

Components

Database Service Agent

Ensure that the Database Service Agent is running.

$> systemctl status dbcsagent.service
* dbcsagent.service
   Loaded: loaded (/usr/lib/systemd/system/dbcsagent.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2022-09-12 15:22:52 CEST; 1 weeks 4 days ago
 Main PID: 24498 (bash)
   CGroup: /system.slice/dbcsagent.service
           |-24498 /bin/bash -c umask 077; /bin/java  -Doracle.security.jps.config=/opt/oracle/dcs/agent/jps-config.xml -jar  /opt/oracle/dcs/bin/dbcs-agent-*.jar server /opt/oracle/dcs/conf/dcs-agent.json >/opt/oracle/dcs/l...
           `-24506 /bin/java -Doracle.security.jps.config=/opt/oracle/dcs/agent/jps-config.xml -jar /opt/oracle/dcs/bin/dbcs-agent-22.2.1.1.0_220713.1149.jar server /opt/oracle/dcs/conf/dcs-agent.json

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

Trace File Analyzer (TFA)

Ensure that the Trace File Analyzer is running on all VM cluster nodes.

$> tfactl status

.-------------------------------------------------------------------------------------------------------.
| Host            | Status of TFA | PID   | Port | Version    | Build ID             | Inventory Status |
+-----------------+---------------+-------+------+------------+----------------------+------------------+
| exaccnode01     | RUNNING       | 24134 | 5000 | 22.1.1.0.0 | 22110020220516195917 | COMPLETE         |
| exaccnode02     | RUNNING       | 24080 | 5000 | 22.1.1.0.0 | 22110020220516195917 | COMPLETE         |
'-----------------+---------------+-------+------+------------+----------------------+------------------'
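
For scripted checks, the table above can be filtered instead of read by eye. The following is a minimal sketch, assuming the output layout shown in this post (pipe-separated columns, host in the first column, status in the second); the function name tfa_not_running is only an illustration and not part of TFA.

```bash
#!/usr/bin/env bash
# Reads `tfactl status` output on stdin and prints every host whose
# "Status of TFA" column is not RUNNING.
# Exit status is 0 only if all listed hosts report RUNNING.
# Assumption: the table layout matches the sample shown in this post.
tfa_not_running() {
  awk -F'|' '
    /^\|/ {                                   # only table rows start with "|"
      host = $2; status = $3
      gsub(/^[ \t]+|[ \t]+$/, "", host)       # trim padding
      gsub(/^[ \t]+|[ \t]+$/, "", status)
      if (host == "Host") next                # skip the header row
      if (status != "RUNNING") { print host; bad = 1 }
    }
    END { exit bad + 0 }
  '
}

# Usage on a cluster node:
#   tfactl status | tfa_not_running && echo "TFA is running on all nodes"
```

Border lines of the table do not start with a pipe character, so they are ignored.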

Oracle System Resource Analysis (sysLens)

sysLens gathers the telemetry data of each VM cluster node. The service is enabled and started while the work request is processed. The daemon is restarted automatically every 6 hours.

$> systemctl status syslens
* syslens.service
   Loaded: loaded (/etc/systemd/system/syslens.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2022-09-24 12:33:00 CEST; 51s ago
  Process: 333885 ExecStopPost=/var/opt/oracle/syslens/bin/syslens --stop (code=exited, status=0/SUCCESS)
 Main PID: 333897 (python3)
   Memory: 30.8M
   CGroup: /system.slice/syslens.service
           `-333897 /usr/bin/python3 /var/opt/oracle/syslens/bin/syslens_main.py --archive /var/opt/oracle/log/syslens --config /var/opt/oracle/syslens/data/exacc.syslens.config --daemon --direct /var/opt/oracle/syslens/data...

Sep 24 12:33:00 fsdebsq15vd0001 systemd[1]: Started syslens.service.

The daemon uses the configuration file /var/opt/oracle/syslens/data/exacc.syslens.config. To verify that collection of the required telemetry data is enabled, run the following command.

$> /usr/bin/syslens --config /var/opt/oracle/syslens/data/exacc.syslens.config --get-key enable_telemetry

syslens Collection 2.3.3

on
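
If this check should run from a script, the banner line has to be skipped. A small sketch, assuming the output always looks like the sample above (a version banner followed by the value on the last non-blank line); the helper name telemetry_enabled is made up for this example.

```bash
#!/usr/bin/env bash
# Reads the output of `syslens ... --get-key enable_telemetry` on stdin.
# Succeeds only if the first word of the last non-blank line is "on".
# Assumption: the banner comes first and the key value comes last,
# as in the sample output in this post.
telemetry_enabled() {
  local value
  value=$(awk 'NF { last = $1 } END { print last }')
  [ "$value" = "on" ]
}

# Usage on a cluster node:
#   /usr/bin/syslens --config /var/opt/oracle/syslens/data/exacc.syslens.config \
#       --get-key enable_telemetry | telemetry_enabled && echo "telemetry is on"
```

Any value other than on, including empty output, is treated as not enabled.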

sysLens can also be run manually with the following command.

$> syslens --config /var/opt/oracle/syslens/data/exacc.syslens.config
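
The two systemd units mentioned in this section, dbcsagent.service and syslens.service, can also be checked in one go. The sketch below relies only on systemctl is-active; the function name check_units is an illustration, not an Oracle tool.

```bash
#!/usr/bin/env bash
# Prints one OK/FAIL line per systemd unit passed as an argument.
# Returns non-zero if at least one unit is not active.
check_units() {
  local rc=0 unit
  for unit in "$@"; do
    if systemctl is-active --quiet "$unit"; then
      echo "OK    $unit"
    else
      echo "FAIL  $unit"
      rc=1
    fi
  done
  return "$rc"
}

# Usage on a cluster node:
#   check_units dbcsagent.service syslens.service
```

Because the function only returns an exit status and a short report, it is easy to run on every node of the VM cluster, for example over ssh.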

Notifications

OCI Notifications and the OCI Events Service are used to send email notifications when an issue is detected.

Topic

To receive email notifications, a Topic must be created first. Afterward, users can subscribe to this Topic. Navigate to Developer Services > Application Integration > Notifications and click the Create Topic button.

Create Topic

After the Topic is created, open the details page and click on Create Subscription.

Create Subscription

You will receive an email to confirm your subscription.

Rules

Now it is time to configure Rules for the events of interest. Currently, the following Database Service Event Types are supported.

  • Database – Critical
  • DB Node – Critical
  • DB Node – Error
  • DB Node – Warning
  • DB Node – Info
  • DB System – Critical

Navigate to Observability & Management > Events Service > Rules and click on Create Rule. To demonstrate this feature, I will use the event type DB Node – Critical.

Create Rule

Test

To test the configuration, you can simulate a mount point running out of free space by creating a large file with dd.

$> dd if=/dev/zero of=/u02/app/data.dat bs=1M count=60000
60000+0 records in
60000+0 records out
62914560000 bytes (63 GB) copied, 216.694 s, 290 MB/s
$> df -h /u02
Filesystem                                      Size  Used Avail Use% Mounted on
/dev/mapper/VGExaDbDisk.u02_extra.img-LVDBDisk  148G  135G  6.0G  96% /u02
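
While waiting for the notification, the fill level can be read from df in a script-friendly way. This is a rough sketch that assumes the POSIX output format of df -P (one line per file system, Use% in the fifth column); the helper name used_pct is made up for this example. The usage comments also show the cleanup of the test file created above.

```bash
#!/usr/bin/env bash
# Reads `df -P <mount point>` output on stdin and prints the Use% value
# of the first file system line without the percent sign.
used_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage on a cluster node:
#   df -P /u02 | used_pct
# Remove the test file once the test is finished:
#   rm -f /u02/app/data.dat
```

Removing the file afterwards matters, because the mount point would otherwise stay almost full.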

You should receive an email notification with details about the incident.
