In April 2022, Oracle released a new feature called Database Service Events for the ExaCC platform. This feature collects telemetry data of VM cluster nodes and sends a notification when a health issue is detected.
Enable Diagnostics Notification
By default, the feature is disabled for a VM cluster. To enable it, use either the OCI web console or the OCI CLI.
Update dbaascli to version 22.3.1.0.1 (220831) or higher to fix an issue that leads to the generation of huge amounts of incident files in the /var/opt/oracle/log/syslens/sysLens/sysLens/incident directory.
OCI Web Console
Navigate to the details page of your VM cluster and click the Enable link next to Diagnostics Notification in the General Information area.
After you confirm the popup dialog, a work request is started. It takes about 2 minutes to complete.
Components
Database Service Agent
Ensure that the Database Service Agent is running.
$> systemctl status dbcsagent.service
* dbcsagent.service
Loaded: loaded (/usr/lib/systemd/system/dbcsagent.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2022-09-12 15:22:52 CEST; 1 weeks 4 days ago
Main PID: 24498 (bash)
CGroup: /system.slice/dbcsagent.service
|-24498 /bin/bash -c umask 077; /bin/java -Doracle.security.jps.config=/opt/oracle/dcs/agent/jps-config.xml -jar /opt/oracle/dcs/bin/dbcs-agent-*.jar server /opt/oracle/dcs/conf/dcs-agent.json >/opt/oracle/dcs/l...
`-24506 /bin/java -Doracle.security.jps.config=/opt/oracle/dcs/agent/jps-config.xml -jar /opt/oracle/dcs/bin/dbcs-agent-22.2.1.1.0_220713.1149.jar server /opt/oracle/dcs/conf/dcs-agent.json
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
Trace File Analyzer (TFA)
Ensure that the Trace File Analyzer is running on all VM cluster nodes.
$> tfactl status
.-------------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-----------------+---------------+-------+------+------------+----------------------+------------------+
| exaccnode01 | RUNNING | 24134 | 5000 | 22.1.1.0.0 | 22110020220516195917 | COMPLETE |
| exaccnode02 | RUNNING | 24080 | 5000 | 22.1.1.0.0 | 22110020220516195917 | COMPLETE |
'-----------------+---------------+-------+------+------------+----------------------+------------------'
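On clusters with many nodes, reading the table by eye gets tedious. The following is a minimal sketch, assuming the column layout shown above (host in the first column, status in the second); the function name tfa_not_running is made up for this example and is not part of TFA.

```shell
# Reads `tfactl status` table output on stdin and prints every host whose
# "Status of TFA" column is not RUNNING. Assumes the layout shown above.
tfa_not_running() {
  awk -F'|' '
    NF >= 4 {
      host = $2
      status = $3
      # Trim the padding around both cells.
      gsub(/^[ \t]+|[ \t]+$/, "", host)
      gsub(/^[ \t]+|[ \t]+$/, "", status)
      # Skip the header row, report everything that is not RUNNING.
      if (host != "Host" && status != "RUNNING") print host
    }'
}
```

Run it as `tfactl status | tfa_not_running`. No output means every listed node reports RUNNING.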
Oracle System Resource Analysis (sysLens)
sysLens gathers the telemetry data of each VM cluster node. The service is enabled and started while the work request is processed, and the daemon is automatically restarted every 6 hours.
$> systemctl status syslens
* syslens.service
Loaded: loaded (/etc/systemd/system/syslens.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2022-09-24 12:33:00 CEST; 51s ago
Process: 333885 ExecStopPost=/var/opt/oracle/syslens/bin/syslens --stop (code=exited, status=0/SUCCESS)
Main PID: 333897 (python3)
Memory: 30.8M
CGroup: /system.slice/syslens.service
`-333897 /usr/bin/python3 /var/opt/oracle/syslens/bin/syslens_main.py --archive /var/opt/oracle/log/syslens --config /var/opt/oracle/syslens/data/exacc.syslens.config --daemon --direct /var/opt/oracle/syslens/data...
Sep 24 12:33:00 fsdebsq15vd0001 systemd[1]: Started syslens.service.
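The two units shown above, dbcsagent.service and syslens.service, can also be checked non-interactively, for example from a monitoring script. This is only a sketch: it relies on `systemctl is-active --quiet`, and the SYSTEMCTL variable is a hypothetical hook added here so the function can be tried without systemd.

```shell
# Prints the name of every given unit that systemd does not report as active.
# SYSTEMCTL is a hypothetical override for testing; it defaults to systemctl.
units_inactive() {
  for unit in "$@"; do
    if ! "${SYSTEMCTL:-systemctl}" is-active --quiet "$unit"; then
      echo "$unit"
    fi
  done
}
```

Call it as `units_inactive dbcsagent.service syslens.service` on each node. An empty result means both units are active.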
The daemon uses the configuration file /var/opt/oracle/syslens/data/exacc.syslens.config. To verify that the collection of the required telemetry data is enabled, run the following command.
$> /usr/bin/syslens --config /var/opt/oracle/syslens/data/exacc.syslens.config --get-key enable_telemetry
syslens Collection 2.3.3
on
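To use this check in a script, the value can be picked out of the command output. The sketch below assumes the output format shown above, where a banner line is followed by the value on the last line; the function name telemetry_enabled is made up for this example.

```shell
# Reads the output of the --get-key command on stdin and succeeds only if the
# last non-empty line is exactly "on". Assumes the banner-then-value format
# shown above.
telemetry_enabled() {
  [ "$(awk 'NF { last = $0 } END { print last }')" = "on" ]
}
```

For example: `/usr/bin/syslens --config /var/opt/oracle/syslens/data/exacc.syslens.config --get-key enable_telemetry | telemetry_enabled && echo "telemetry enabled"`.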
sysLens can also be run manually with the following command.
$> syslens --config /var/opt/oracle/syslens/data/exacc.syslens.config
Notifications
The OCI Notifications service and the OCI Events Service are used to send email notifications when an issue is detected.
Topic
To receive an email notification, a Topic needs to be created. Afterward, users can subscribe to this Topic. Navigate to Developer Services > Application Integration > Notifications and press the Create Topic button.
After the Topic is created, open the details page and click on Create Subscription.
You will receive an email to confirm your subscription.
Rules
Now it is time to configure Rules for the events of interest. Currently, the following Database Service Event Types are supported.
- Database – Critical
- DB Node – Critical
- DB Node – Error
- DB Node – Warning
- DB Node – Info
- DB System – Critical
Navigate to Observability & Management > Events Service > Rules and click on Create Rule. To demonstrate this feature, I will use the event type DB Node – Critical.
Test
To test the configuration, you can simulate a mount point running out of free space by creating a huge file with dd.
$> dd if=/dev/zero of=/u02/app/data.dat bs=1M count=60000
60000+0 records in
60000+0 records out
62914560000 bytes (63 GB) copied, 216.694 s, 290 MB/s
$> df -h /u02
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VGExaDbDisk.u02_extra.img-LVDBDisk 148G 135G 6.0G 96% /u02
You should receive an email notification with details about the incident.
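Before waiting for the notification, it can help to confirm that the mount point is really as full as intended. The following sketch extracts the usage percentage from `df -P` output (the POSIX format keeps each file system on one line, even with long device names); the function name usage_pct is made up for this example.

```shell
# Reads `df -P <mountpoint>` output on stdin and prints the usage percentage
# (the Capacity column) of the first file system as a bare number.
usage_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

With the file created above, `df -P /u02 | usage_pct` should print a value matching the Use% column. Once the test is finished, remove the file again with `rm /u02/app/data.dat` to free the space.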