Panorama of ESM Tools: HP OVO


Tuesday, September 23, 2014

Remedy SPI - Config file & Rules Files


HP OM is an event management tool. For a complete ITIL workflow it has to be integrated with one of the ITSM tools available on the market.

We used the Remedy Smart Plug-in (remspi) by HP to integrate HP OVO with BMC Remedy Action Request System (ARS) in our environment.

[Figure: basic flow within the HP OVO / Remedy ARS integration]

The above diagram shows a basic flow within the integration.

This post expects the reader to be well versed in HP OVO and Remedy ARS. Refer to SPI_RARS_AdminRef.pdf for more information.

This post gives a brief example of the cfg file and the rules file.

1. The SPI for Remedy ARS server receives the ID of an HPOM message from the HPOM management server via a submission program.

2. The SPI for Remedy ARS server retrieves details of the HPOM message from the HPOM management server via an application programmer interface (API).

3. The SPI for Remedy ARS server creates and updates action requests using the ARS API.

4. When an action request changes, the action-request system calls the SPI for Remedy ARS client and passes on the details of the update via the ARS filter mechanism.

5. The SPI for Remedy ARS client sends the update details to the SPI for Remedy ARS server.

6. The SPI for Remedy ARS server updates the HPOM message on the HPOM management server using HPOM’s API.

7. When an HPOM message is modified, the SPI for Remedy ARS server receives a message-change event via the HPOM API.

8. HPOM sends the details of the message-change event to the SPI for Remedy ARS server via the HPOM API.

9. The SPI for Remedy ARS server updates the action request by means of the Remedy-ARS application-programming interface (API).

The data components that are used in the data flow are:

External Actions and Data
Configuration file - remspi.cfg
Rules File - remspi_rules.txt
The SPI for Remedy ARS database

Configuration File - /etc/opt/OV/share/conf/remspi/remspi.cfg

The configuration file remspi.cfg defines the user name and passwords that the SPI for Remedy ARS uses to log in to HPOM. The passwords are encrypted.

A user with appropriate permissions must exist in HPOM.

Use remspipasswd to encrypt the password of this user.

remspipasswd <new_user_password>

Example of a configuration file :

# File: remspi.cfg

# Description: Configuration file for Remedy(tm) SPI

# Package: HP Operations SMART Plug-In for

# Remedy(tm) Action Request System(r)

# Note: Value must be on same line as keyword

REMSPI_ITO_USER remspi_server

REMSPI_ITO_PASSWD C338D5F21219E076C2000C45AA0475925A1

REMSPI_ITO_ADMIN_PASSWD C338D5F21219F076C2000C4VAA0475925A1
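The keyword/value layout of the file above is easy to read programmatically. The following is only a minimal sketch, not how remspi itself reads the file: it assumes that lines starting with "#" are comments and that a keyword is separated from its value by whitespace on the same line, which is what the sample suggests. The function name parse_remspi_cfg is hypothetical.

```python
def parse_remspi_cfg(text):
    """Return {keyword: value} from remspi.cfg-style text.

    Assumption (from the sample only): '#' starts a comment line, and each
    setting is 'KEYWORD value' on a single line.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # The value must be on the same line as the keyword.
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"missing value for keyword: {line!r}")
        keyword, value = parts
        settings[keyword] = value.strip()
    return settings


sample = """\
# File: remspi.cfg
# Note: Value must be on same line as keyword

REMSPI_ITO_USER remspi_server
REMSPI_ITO_PASSWD C338D5F21219E076C2000C45AA0475925A1
"""

cfg = parse_remspi_cfg(sample)
print(cfg["REMSPI_ITO_USER"])
```

A check like this can be handy before restarting the SPI, for example to confirm that no keyword has lost its value after a manual edit.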

Rules File: /etc/opt/OV/share/conf/remspi/rules

Sample Rules file: This is an example from my test environment

SYNTAX_VERSION 2.4

SPI_RULES "SPI Rules - ProblemReport"

DESCRIPTION "Example rules for the Problem Report System"

TARGET "Probsys"

SCHEMA "Problem Report"

USER "ovouser"

SERVER "REMEDY_PRODSERVER"

BACKUP_SERVER "REMEDY_BACKUPSERVER"

PASSWORD "KMIDDKLFN9R43093LNSVNDIFDFLN49RRLADMC1233NDS"

FIRST FIELD 2 "$ITO_SUBMITTER$"

FIELD 7 "New"

FIRST FIELD 8 "$MSG_TEXT$"

UPDATE OVERWRITE FIRST FIELD 8000000078 "$MSG_TEXT$"

FIRST FIELD 8000000079 "\\n$MSG_ANNO$\\n"

UPDATE OVERWRITE FIRST FIELD 8000000080 "$MSG_SEVERITY$"

FIRST FIELD 8000000081 "$MANAGEMENT_SERVERS$"

FIRST FIELD 8000000082 "$NODE_NAME$"

FIRST FIELD 8000000083 "$NODE_APPL$"

FIRST FIELD 8000000084 "$MSG_GRP$"

FIRST FIELD 8000000085 "$MSG_OBJ$"

FIRST FIELD 8000000086 "$MSG_ID$"

FIRST FIELD 8000000087 "$Instructions$"

UPDATE OVERWRITE ALL FIELD 8000000088 "$ACK_USER$"

FIRST FIELD 8000000089 "$EXTERNAL Severity$"

ON_SUBMIT

OWN #OVO message is owned after creation

MSGTXT_PREFIX "ARS-ID=$ARS_ID$: "

ITO_UPDATE

ANNOTATE 7 "Closed" "Action request closed by $5$ on $6$"

FORCE OWN 7 "Assigned"

OWN 7 "Fixed"

DISOWN 7 "Rejected"

ACKNOWLEDGE 7 "Closed"

UNACKNOWLEDGE 7 "Reopened"

OP_ACTION "operator-action"

ESCALATE "escalate"

ANNOTATE "Action request modified"

Latest work log entry: $8000000079$

MSGCONDITIONS

DESCRIPTION "Catch All"

CONDITION

SET TARGET "Probsys"

FIELD 8000000083 "$NODE_APPL$"

SUPPRESSCONDITIONS

DESCRIPTION "Condition Suppress"

CONDITION

MSGGRP "Printers"

Severity "Warning"

A brief explanation of the rules file:

Numbers such as 8000000080 are field IDs in the Remedy ARS form that is used when remspi is invoked. Here the form is called "Problem Report".

MSG_SEVERITY --> 8000000080 is the severity of the alert as it arrives in OM

EXTERNAL Severity --> 8000000089 is the severity chosen by the operator (DC-ops uses semi-automation to create tickets)

8000000079 "\\n$MSG_ANNO$\\n" --> this goes into the work log of the ticket

MSGCONDITIONS are the conditions that have to be satisfied for a ticket to be created.

SUPPRESSCONDITIONS are the conditions which have to be true for a message to be filtered out.
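To see at a glance which ARS field IDs a rules file populates, the FIELD lines can be pulled out with a small script. This is a rough sketch that only understands the line shapes seen in the sample above (optional modifiers, the word FIELD, a numeric ID and a quoted value); it is not a parser for the full rules syntax, and the function name field_mappings is hypothetical.

```python
import re

# Matches lines such as:
#   FIRST FIELD 2 "$ITO_SUBMITTER$"
#   UPDATE OVERWRITE FIRST FIELD 8000000078 "$MSG_TEXT$"
# Assumption: one FIELD statement per line, value in double quotes.
FIELD_RE = re.compile(r'^(?P<mods>.*?)\bFIELD\s+(?P<id>\d+)\s+"(?P<value>.*)"\s*$')


def field_mappings(rules_text):
    """Return a list of (field_id, value, modifiers) for each FIELD line."""
    mappings = []
    for raw in rules_text.splitlines():
        match = FIELD_RE.match(raw.strip())
        if match:
            modifiers = match.group("mods").split()
            mappings.append((int(match.group("id")), match.group("value"), modifiers))
    return mappings


sample = '''\
FIRST FIELD 2 "$ITO_SUBMITTER$"
FIELD 7 "New"
UPDATE OVERWRITE FIRST FIELD 8000000080 "$MSG_SEVERITY$"
ON_SUBMIT
ANNOTATE 7 "Closed" "Action request closed by $5$ on $6$"
'''

mappings = field_mappings(sample)
for field_id, value, modifiers in mappings:
    print(field_id, value, " ".join(modifiers))
```

Comparing this list against the field IDs defined in the Remedy form is a quick way to spot a mistyped ID.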

Sometimes the RemedySPI process doesn't start with ovstart. This behavior seems to be normal. remspisrv is integrated into ovstop/ovstart through the remspisrv.lrf file, and ovstop/ovstart is the official way to stop and start it. ovstop stops RemedySPI; a subsequent ovstart starts the OVO processes, but ovspmd doesn't know the previous status of RemedySPI and doesn't start it again. You have to run "ovstart RemedySPI" to start it.

Another issue that I have faced with remspi is that remspiconfig and remspifilter produce the following errors.

Can't lock DBM database files '/var/opt/OV/share/tmp/remspi/Probsys' for target 'Probsys'. Permision Denied (SPI215-45)

Can't lock DBM database files '/var/opt/OV/share/tmp/remspi/_Remspi' for target '_Remspi'. Permision Denied (SPI215-45)

The technical document (Doc ID emr_na-c00917549-1) says:

These error messages occur when the Remedy SPI process is running due to the fact that the process locks these files.

Run "ovstop -c RemedySPI", then run remspiconfig

This also pertains to the remspifilter command.

Sunday, September 21, 2014

Disable HTTPS agent tracing – ovtrccfg

Every weekend, my QA OM server is stopped for an offline Oracle backup. This weekend, after the backup, OM did not come back up. I tried to start it manually and it still wouldn't start. On checking the log System.txt, I found these entries:

0: ERR: Mon Jul 28 06:01:16 2014: ovcd (23454/47264176451744): (ctrl-127) Could not write the ovcd pid to file: 'No space left on device'.
0: INF: Mon Jul 28 06:02:17 2014: ovc (23451/47221706264736): (ctrl-103) Could not start the Control daemon.
0: ERR: Mon Jul 28 06:02:19 2014: ovcd (25940/1103829312): (sec.cm.client-23) Error during CertificateClient initialization.
1: ERR: Mon Jul 28 06:02:19 2014: ovcd (25940/1103829312): (xpl-89) write(13)[19720B50] failed.
2: ERR: Mon Jul 28 06:02:19 2014: ovcd (25940/1103829312): (RTL-28) No space left on device
0: ERR: Mon Jul 28 06:02:19 2014: ovcd (25940/1103829312): (ctrl-63) Error initializing RPC server: Can not initialize RpcServer. error=(xpl-89) write(16)[19724D60] failed.
 (RTL-28) No space left on device.

/var/opt/OV was 100% full.

Tracing was enabled and the trace file was almost 13GB.

I deleted the trace file, disabled tracing and tried to start opcsv.
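When a file system fills up like this, the first job is to find what is eating the space. The sketch below is just one way to do that with the Python standard library: it reports how full the file system holding a directory is and lists the largest files under it. The function names are hypothetical, and nothing here is specific to HP OM.

```python
import os
import shutil


def percent_used(path):
    """Percentage of the file system holding 'path' that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(root, limit=5):
    """Return up to 'limit' (size_in_bytes, path) pairs, biggest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full_path), full_path))
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return sorted(sizes, reverse=True)[:limit]
```

On the server in this post, calling largest_files("/var/opt/OV") would have pointed at the trace file, and percent_used("/var/opt/OV") could feed a simple threshold alarm.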

#>>/opt/OV/support/ovtrccfg -help (used for tracing)

#>> /opt/OV/support/ovtrcadm -hosts

The following are the allowed clients for TraceServer

S. No. Client Name or IP address
-------- -------------------------

This is used to list the nodes that are configured to trace.

#>>/opt/OV/support/ovtrcadm -srvconfig 

TraceServer is running

Application Components Categories Level Enabled
----------- ---------- ---------- -------------
com.hp.openview.OvDiscoveryServer.OvDiscoveryServer
org.apache.catalina.startup.Bootstrap
java.lang.Thread
com.hp.ov.svcdisc.SvcDiscClient

This gives the list of what is being traced.

I used the following command to disable tracing.

ovtrccfg -off

This disabled the tracing and I was able to start OM using opcsv.

Wednesday, September 17, 2014

HP OM - Policy Administration

Policies determine the action to be taken as a result of events that are intercepted.
Put another way, policies are sets of rules on which monitoring is based.

Policies need to be assigned and distributed to managed nodes from the management server. This can be done either from the command line or from the Admin GUI in version 9 (the Motif GUI in version 8).

opctemplate / ovpolicy are the commands used for policy management.

Assigning a policy:

A policy needs to be assigned to the node that it will be distributed to.
Assignment can be done either from the command line using the opcnode command or from the GUI.

opcnode -assign_pol pol_name=<> pol_type=<> [version=<>] [mode=<>] node_name=<> net_type=NETWORK_IP
OR
opcnode -assign_pol pol_name=<> pol_type=<> [version=<>] [mode=<>] group_name=<>

where pol_type can be one of:
TEMPLATE_GROUP
CONSOLE_TEMPLATE
OPCMSG_TEMPLATE
LOGFILE_TEMPLATE
MONITOR_TEMPLATE
SNMP_TEMPLATE
EC_TEMPLATE
SCHEDULE_TEMPLATE
version is the version of the policy you want to assign.
mode is the way a policy is assigned to a node. This can be one of:
FIX (default)
MINOR_TO_LATEST
LATEST
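If assignments are scripted, it helps to validate the arguments before the command is run. The following is a small sketch that only assembles the opcnode argument list from the syntax shown above; it does not execute anything. The helper name assign_policy_cmd and the policy and node names in the example are hypothetical.

```python
import shlex

# Values taken from the lists above.
POL_TYPES = {
    "TEMPLATE_GROUP", "CONSOLE_TEMPLATE", "OPCMSG_TEMPLATE", "LOGFILE_TEMPLATE",
    "MONITOR_TEMPLATE", "SNMP_TEMPLATE", "EC_TEMPLATE", "SCHEDULE_TEMPLATE",
}
MODES = {"FIX", "MINOR_TO_LATEST", "LATEST"}


def assign_policy_cmd(pol_name, pol_type, node_name=None, group_name=None,
                      version=None, mode=None):
    """Build the argument list for 'opcnode -assign_pol' (not executed here)."""
    if pol_type not in POL_TYPES:
        raise ValueError(f"unknown pol_type: {pol_type}")
    if mode is not None and mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if (node_name is None) == (group_name is None):
        raise ValueError("give exactly one of node_name or group_name")

    args = ["opcnode", "-assign_pol", f"pol_name={pol_name}", f"pol_type={pol_type}"]
    if version is not None:
        args.append(f"version={version}")
    if mode is not None:
        args.append(f"mode={mode}")
    if node_name is not None:
        args += [f"node_name={node_name}", "net_type=NETWORK_IP"]
    else:
        args.append(f"group_name={group_name}")
    return args


# Hypothetical policy and node names, for illustration only.
cmd = assign_policy_cmd("disk_monitor", "MONITOR_TEMPLATE",
                        node_name="web01.example.com", mode="LATEST")
print(shlex.join(cmd))
```

Returning a list rather than a string keeps the result safe to hand to subprocess.run without shell quoting problems.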

Distributing a policy :

After a policy is assigned, it can be distributed using the opcragt command:

opcragt -distrib [ -policies ][ -templates ][ -instrum ][ -actions ][ -monitors ][ -commands ] [ -subagts ] [ -force | -purge ] [ -highprio ] [ -simulate ]

-force => The data is transferred even if it exists already on the node.
-purge => Instrumentation is removed from the node and deployed again.
-highprio => will ignore limitations set on the number of simultaneous deployments and immediately trigger the deployment to the specified node.
-simulate => only a simulation, the files/policies will not be distributed to the node.

Listing policies on the managed node :

opctemplate / ovpolicy can be used to list all the policies distributed to the managed node.
# opctemplate -list
* List installed policies for host 'localhost'.

Version Status
--------------------------------------------------------------------

CONFIGSETTINGS "OVO settings" enabled 1

LOGFILE "<******>log" enabled 0001.0006

mgrconf "OVO authorization" enabled 1

MONITOR "<******************>" enabled 0001.0000

OPCMSG "<******************>" enabled 0001.0004

SCHEDULE "<*********************>" enabled 0001.0004

SNMPTRAP "<*****************>" enabled 0001.0013

# ovpolicy -list

* List installed policies for host 'localhost'.

Version Status

--------------------------------------------------------------------

configsettings "OVO settings" enabled 1

le "<******>log" enabled 0001.0006

mgrconf "OVO authorization" enabled 1

monitor "<************>" enabled 0001.0004

msgi "<******************>" enabled 0001.0004

sched "<*********************>" enabled 0001.0004

trapi "<*****************>" enabled 0001.0013

Listing policies on the managed node remotely from the management server :

# ovpolicy -list -host <******>
* List installed policies for host '<******>'.

Version Status
--------------------------------------------------------------------
configsettings "OVO settings" enabled 1
le "<LOGFILE POLICY NAME>" enabled 0001.0000

Removing policies on the managed node :

On the managed node :

ovpolicy -remove -all

Describing a policy on the node :

Policies are stored on the managed node under /var/opt/OV/datafiles/policies, in a subdirectory for each policy type.

# ls -lrt
total 28
drwxrwxr-x 2 root root 4096 Jul 16 19:06 configsettings
drwxrwxr-x 2 root root 4096 Aug 11 15:25 le
drwxrwxr-x 2 root root 4096 Sep 11 11:59 msgi
drwxrwxr-x 2 root root 4096 Sep 12 17:42 monitor
drwxrwxr-x 2 root root 4096 Sep 12 17:58 trapi
drwxrwxr-x 2 root root 4096 Sep 12 17:58 sched

drwxrwxr-x 2 root root 4096 Sep 17 12:07 mgrconf

Each policy has a data file and a header XML file. The data file contains the policy itself.

[>>:/var/opt/OV/datafiles/policies/trapi]

# ls -lrt

total 88

-r--r----- 1 root root 9171 Jul 16 19:06 0<***>0_data

-r--r----- 1 root root 11051 Jul 16 19:06 7<***>0_data

-r--r----- 1 root root 29541 Jul 16 19:06 b<***>0_data

-r--r----- 1 root root 1281 Aug 11 15:25 c<***>0_data

-r--r----- 1 root root 3333 Aug 11 18:12 7<***>0_header.xml

-r--r----- 1 root root 3331 Aug 11 18:13 b<***>0_header.xml

-r--r----- 1 root root 3326 Aug 11 18:13 c<***>0_header.xml

-r--r----- 1 root root 3302 Aug 11 18:13 0<***>0_header.xml

-r--r----- 1 root root 3271 Sep 12 17:58 8<***>0_header.xml

-r--r----- 1 root root 4450 Sep 12 17:58 8<***>0_data

Running grep on these files for the policy name yields the file names that belong to that particular policy.
# grep "<NAME OF POLICY>" *
c<***>0_data:SNMP "<NAME OF POLICY>"

c<***>0_header.xml: <name><NAME OF POLICY></name>
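The same lookup can be scripted when it has to be repeated for many policies. This is a minimal sketch of the grep above in Python: it scans every regular file in one policy-type directory and returns the names of the files that mention the policy name. The function name find_policy_files is hypothetical.

```python
import os


def find_policy_files(policy_dir, policy_name):
    """Return sorted names of files in policy_dir that contain policy_name."""
    needle = policy_name.encode()
    hits = []
    for name in sorted(os.listdir(policy_dir)):
        path = os.path.join(policy_dir, name)
        if not os.path.isfile(path):
            continue
        # Read as bytes so that odd encodings in data files do not raise.
        with open(path, "rb") as handle:
            if needle in handle.read():
                hits.append(name)
    return hits
```

For a policy that is installed once, the result should be one _data file and its matching _header.xml file, as in the grep output above.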

Thursday, January 19, 2012

Reset CoreID of Management Server - OMU v8

Let's jump a few steps ahead. I wanted to share something that I learnt today.
Suppose the CoreID of the management server got reset or deleted for some reason; we would need to set it back to its previous value.

To get the previous CoreID, we can run the following command either on the management server or on a managed node.
On the management server :

[root@management_server:/root]
# ovcert -list
+---------------------------------------------------------+
| Keystore Content |
+---------------------------------------------------------+
| Certificates: |
| 0e86<******>5 (*) |
| 9109<******>d (*) |
+---------------------------------------------------------+
| Trusted Certificates: |
| CA_0e86<******>5 |
| CA_543b<******>a |
| CA_7ac7<******>5 |
| CA_d385<******>e |
+---------------------------------------------------------+

+---------------------------------------------------------+
| Keystore Content (OVRG: server) |
+---------------------------------------------------------+
| Certificates: |
| 0e86<******>5 (*) |
+---------------------------------------------------------+
| Trusted Certificates: |
| CA_0e86<******>5 (*) |
| CA_543b<******>a |
| CA_7ac7<******>5 |
| CA_d385<******>e |
+---------------------------------------------------------+

On the managed node :

[root@managed_node:/root]
# ovcert -list
+---------------------------------------------------------+
| Keystore Content |
+---------------------------------------------------------+
| Certificates: |
| beb5<******>1 (*) |
+---------------------------------------------------------+
| Trusted Certificates: |
| CA_0e86<******>5 |
| CA_543b<******>a |
| CA_7ac7<******>5 |
| CA_d385<******>e |
+---------------------------------------------------------+

Usually, if there is only one management server in the environment, only one certificate is listed under 'Trusted Certificates'.
Note : On the management server, the certificate under 'Trusted Certificates' marked with (*) at the end is the CoreID of that particular server.
Similarly, on the managed node, the certificate marked with (*) is the CoreID of that particular managed node.
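Picking the (*) entries out of the ovcert -list output by eye gets tedious with several management servers. Here is a rough sketch that parses text laid out like the listings above into keystore blocks; it assumes the box layout shown in this post and has not been checked against other agent versions. The function name and the certificate IDs in the sample are hypothetical.

```python
def parse_ovcert_list(text):
    """Parse 'ovcert -list' style text into a list of keystore dicts.

    Each dict has 'title', 'certificates' and 'trusted'; the last two are
    lists of (cert_id, starred) tuples, where starred means '(*)' was present.
    Assumption: the box layout matches the listings in this post.
    """
    stores = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # border lines ("+---+") and blank lines
        body = line.strip("|").strip()
        if body.startswith("Keystore Content"):
            stores.append({"title": body, "certificates": [], "trusted": []})
            section = None
        elif body == "Certificates:":
            section = "certificates"
        elif body == "Trusted Certificates:":
            section = "trusted"
        elif body and section and stores:
            starred = body.endswith("(*)")
            cert_id = body[:-3].strip() if starred else body
            stores[-1][section].append((cert_id, starred))
    return stores


# Made-up IDs in the same layout as the managed-node output above.
sample = """\
+--------------------------+
| Keystore Content         |
+--------------------------+
| Certificates:            |
|     beb5-node (*)        |
+--------------------------+
| Trusted Certificates:    |
|     CA_0e86-srv          |
|     CA_543b-other        |
+--------------------------+
"""

stores = parse_ovcert_list(sample)
print(stores[0]["certificates"])
```

The starred entries can then be compared across nodes to confirm which ID belongs to which server before anything is reset.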

Since I have multiple management servers in my environment, there are quite a few listed here.

So, let's proceed with the output from the managed node.
"0e86<******>5" is the CoreId of this managed node's primary management server.

To reset your management server's CoreID, go to the management server and run:

[root@management_server:/root]

#ovcoreid -set < Certificate ID > -force

[root@management_server:/root]

#ovcoreid -set 0e86<******>5 -force

Another procedure is to set the CoreID through the configuration, using ovconfchg:

[root@management_server:/root]

#ovconfchg -ns sec.core -set CORE_ID 0e86<******>5