Revision 424 – Dump Current Detected VMs

In revision 424, much as revision 423 added the ability to dump out detected ESXs, I added the ability to dump out the virtualserver table as vm.csv or vm.xml. The CSV output documents itself in # comments, and the XML is self-describing already. Note that the XML has no whitespace, so “xmllint -format vm.xml” is your friend.

For example:

(typically running locally: the MySQL is configured to only accept local connections)

java -jar vict.jar -D vm.csv

or

java -jar vict.jar -D vm.xml

…and in long-options too:

java -jar vict.jar --dump-config vm.xml

…but since the JRE is typically hidden below the VirtualWisdom directory, use vict.bat:

VICT.BAT -D vm.xml
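
If you want to consume the CSV dump in your own tooling, the # comments need to be skipped first. The following is a minimal Python sketch of that, not part of vict itself; the sample column names are hypothetical, since the real columns are described in the comments of your own vm.csv:

```python
import csv
import io


def read_dump(lines):
    """Yield CSV rows from a dump, skipping blank lines and '#' comment lines."""
    data_lines = (
        ln for ln in lines
        if ln.strip() and not ln.lstrip().startswith("#")
    )
    yield from csv.reader(data_lines)


if __name__ == "__main__":
    # Hypothetical sample content; a real vm.csv documents its own columns.
    sample = io.StringIO("# name,host\nvm01,esx01\n\nvm02,esx02\n")
    for row in read_dump(sample):
        print(row)
```

For a real dump, pass an open file instead of the sample, for example `read_dump(open("vm.csv", newline=""))`.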

How to Re-Use VirtualWisdom Data in 3rd-Party Tools

Everyone I’ve personally met who has witnessed the detail of VirtualWisdom metrics tends to be first amazed, then relates it to LAN and ethernet tools, then questions why we haven’t seen this before. The next question in very large organizations is “how can we re-use this data in our [insert home-grown tool here] ?”

Incorporating VirtualWisdom into an organization has various points of “friction”: training on a new tool, understanding the metrics, collection of data to help VirtualWisdom correlate, and beginning to use it internally. As a Virtual Instruments Field Application Engineer (AE or FAE), I tend to see the initial friction (collection of data, such as nicknames, or grouping Business-Units as UDCs). The less common friction is “OK, we love VirtualWisdom, but our expansive storage team wants to exploit the metrics in our custom home-grown planning tools”.

Converting VirtualWisdom into a basic data collector ignores the reporting, recording, and alerting capabilities it offers; re-using its data in multiple entities of a corporation is an expansion of VirtualWisdom’s utility, and I’m more than happy to help a customer do that. The more we can help our customers make an informed decision (including leveraging more data in their own tools), the more we can help our customers “free their data” and improve the performance and reliability of a complex data-storage environment.

My entries here on the Virtual Instruments Best Practices blog tend to be of a how-to nature; in this article, I’d like to show how the open-source tool “MailDropQueue” can help push VirtualWisdom data into your home-grown toolset.

There was a time that customers tried to find our reports by digging through the “exports” directory after a new report was produced, because the most recent report.csv.zip is the correct one, right? This ran into problems when users generated reports just before scheduled reports: when the scheduled “searcher” went searching, the wrong report would be found. Additionally, some reports took a long time, and would not always be finished by the time the customer’s scripts went searching. Customers typically knew what script they could run to consume the data and push it to their own systems, but the issues in finding that file (more so due to the lack of a shell on Windows) caused this solution to become over-complex.

Replication at the database level gives us the same problem: the data is in a schema, and it’s difficult to make sense of without the reporting engine.

A while ago, VirtualWisdom gained the ability to serialize colliding reports: if a user asks for a report at the same time the system or another user is generating a report, the requests get serialized. This allows VirtualWisdom to avoid deadlock/livelock situations at the risk of the delay we’re used to at printers: your two-page TPS Reports are waiting behind a 4000-page print of the History of the World, Part 1A. The benefit of a consistently-responsive VirtualWisdom platform is well worth this risk. Unfortunately, the API that many users ask for poses this same risk: adding a parallel load onto VirtualWisdom that needs an immediate response, adding delay in both responses and risking concurrency delays at the underlying datastore.

The asynchronous approach — wherein VirtualWisdom can generate data to share through its reporting engine — is more cooperative to VirtualWisdom’s responsiveness, but returns us to the issue of “how do I find that report on the filesystem?  The correct report?”

MailDropQueue is a tool in the traditional nature of UNIX: small things that do specific jobs. UNIX was flush with small tools such as sed, awk, lpr, wc, nohup, nice, cut, etc that could be streamed to achieve complex tasks. In a similar way, MailDropQueue receives an email, strips off the attachment, and for messages matching certain criteria, executes actions for each.

It’s possible for VirtualWisdom to generate “the right data” (blue section, above), send it to MailDropQueue (red portion, above), and have MailDropQueue execute the action on that attachment (green part above).  In our example, let’s consider where a customer knows what they want to do with a CSV file; suppose they have a script such as:

@echo off
call DATABASE-IMPORT.BAT TheData.CSV

The actual magic in this script isn’t as important as the fact that we can indeed trigger it for every attachment we see to a certain destination. Now all we need is to make a destination trigger this script (ie the green portion of the diagram above):


<?xml version='1.0' ?>
<actions>

  <trigger name="all">
    <condition type="true"/>
    <action>IMPORT</action>
  </trigger>

  <script id="IMPORT" name="import" script="DATABASE-IMPORT.BAT" parameters="$attachmentname"/>

</actions>

From the above, the “condition type=true” stands out, but it is possible to constrain this once we know it works, such as to trigger that specific script only when the recipient email matches “ftp@example.com”:


  <condition type="equal">
    <recipient/>
    <value>ftp@example.com</value>
  </condition>

Also, it’s not so obvious, but the result of every received email that matches the condition (“true”) is to run the script with the attachment as the first parameter. This means that if an email arrives with an attachment “performance.csv.zip”, MailDropQueue would Runtime.exec("DATABASE-IMPORT.BAT performance.csv.zip").

For reference, I’m running this on a host called fakemta.example.com, compiled to use the default port (8463) as:

java -jar maildropqueue.jar -c maildropqueue.xml

Where maildropqueue.jar is compiled by defaults (./configure && make) from a “git clone”, and maildropqueue.xml contains the configuration above. There’s a downloadable

Finally, we need to configure VirtualWisdom to generate and send this data attached to an email; this is a fairly simple task for any VirtualWisdom administrator.  Following is a walk-through up to confirming that the content is being generated and sent to the MailDropQueue; the composition of the report and the handler script “DATABASE-IMPORT.BAT” are too environment-specific to cover in this article.

  1. Create the report (outside the scope of this article) — confirm that it produces output. The following snapshot uses our internal demo-database, not actual customer data:

    Capacity / Performance Statistical Report

  2. Create a Schedule to regularly generate and send it:
    1. In the Report Generation Configuration, check that you have the hourly summary if so desired:
    2. Check that all probes are used, but you don’t need to keep the report very long:
    3. Confirm that you have file-format set to CSV unless your handler script can dismantle XLS, or you intend to publish a PDF:
    4. Choose to send email; this is the key part. The message can include anything as subject and body, but you must check “E-mail report as attachment”:
    5. …and finally: you may not yet have the distribution list set up; consider the following example. Note that the port number is 8025, and server is localhost because in this test, I’m running the MailDropQueue on the same server. The sender and recipient don’t matter unless you later determine which actions to run based on triggers matching sender or recipient:
    6. Check that your MailDropQueue is running on the same port (this is an example running MailDropQueue using VirtualWisdom’s enclosed Java and the config example above; the two “non-body MIME skipped:” messages are from clicking “Send Test E-Mail” twice):
  3. Finally, run your MailDropQueue. The script used above is shown here (except that running it requires removing the “-V”, highlighted), as well as the config, and an output of “java -jar maildropqueue.jar -V” to show how MailDropQueue parsed the config file:
  4. Clicking “Run Now” on the Scheduled Action for the report generation shows end-to-end that VirtualWisdom can generate a report, send it to MailDropQueue, and cause a script to be triggered on reception. Of course, if the script configured into MailDropQueue hasn’t been written, a Java error will result, as shown:
  5. Now the only things left to do are:
    1. Write the report so that the correct data is sent as tables and Statistical Summary reports (only one View per section)
    2. Write the DATABASE-IMPORT.BAT so that it reacts correctly to the zipped archive of CSV files
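
The handler itself is environment-specific, but the first half of its job (opening the zipped attachment and walking the CSV rows) looks much the same everywhere. The following is a minimal Python sketch of that half, which a batch handler could call; it assumes the CSV members are UTF-8 text, and the function names are mine, not MailDropQueue’s or VirtualWisdom’s:

```python
import csv
import io
import sys
import zipfile


def rows_from_zipped_csv(zip_path):
    """Yield (member_name, row) for every row of every .csv file in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # ignore anything that is not a CSV member
            with archive.open(name) as raw:
                # Assumption: the report CSVs are UTF-8 encoded text.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.reader(text):
                    yield name, row


def main(argv):
    """Print every row; replace the print with the site-specific import."""
    for zip_path in argv:
        for member, row in rows_from_zipped_csv(zip_path):
            print(member, row)


if __name__ == "__main__":
    main(sys.argv[1:])
```

The attachment name arrives as the first parameter, so the handler only needs to pass it along, for example as “performance.csv.zip”.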

Merging UDCs in VirtualWisdom to Join Manual and Generated UDCs

UDCs — User-Defined Context — can be very useful for showing the actual use, or membership of a device on the SAN, or for assigning a priority for alerting and thresholds. Often, these are hand-generated, but we do have methods of creating them from other content.

One customer has both: a UDC generated/converted from other content, and some manually-assigned content. Merging these would help him to assign filters and alerts as a single group, but the effort to merge it was looking excessive.

As you may recall, my content on the Virtual Instruments Best Practices blog tends to be of the how-to variety, and in this article, I’d like to share how to merge two UDCs programmatically, which can then be scripted in any automated collection scripts or tools you’re already using.

Merge, Then Cleanup

The general process we use for this is to merge the content first (using xmllint), then clean it up (using xsltproc) so that it’s back to sane, predictable UDC that is ready for routinely-scheduled import:

UDCs merged using xmllint and xsltproc

Notice in this image that the only things that change are in the upper box (“UDC Files”), which can be either manually-edited or autonomously-generated by filter or transform. As well, the result is a standard UDC from which we can generate filters or otherwise edit using XML tools.

As you can see, the tools used here are fairly standard; the only real development is the smaller scripts for each tool. UDCs are simply XML, and as such, quite easy to manipulate using standard XML tools.

Let’s break this down into the multiple steps.

Concatenation

The easiest way I found to concatenate two XML files was to use XInclude with an XPointer statement:

<?xml version="1.0"?>
<list xmlns:xi="http://www.w3.org/2001/XInclude">
    <xi:include href="File1.udc" xpointer="xpointer(//list/*)"/>
    <xi:include href="File2.udc" xpointer="xpointer(//list/*)"/>
</list>

Broken into parts, this is really just repetitions of the following:

<xi:include
href="File1.udc"
xpointer="xpointer(//list/*)"
/>

If you’ve written any sourcecode, you’d recognize an #include statement, or an import com.example.*; this is really no different: the document referenced by the “href” (File1.udc) replaces this xi:include statement. The second part, an xpointer="...", further clarifies the import by indicating that only a part of the document we include should come in — in this case, child elements of the “list” element. If you look at a UDC File, you’ll see that “list” is the root node; if that statement makes very little sense, then think of this as “we’re including all the stuff inside the outermost container, but not the container itself”. And hey, look again at the full file above: we specify a <list> and </list> around the inclusions. Coincidence? Not at all; this is a method of avoiding having two outermost root nodes, which cannot be further altered using XML because XML can only have one outermost root node.

…and it’s easier this way: don’t filter out what you can avoid including in the first case. It’s possible that there’s a better set of inclusion elements here, but this works well enough.

If we had three UDCs to merge, you can see that it would merely require another xi:include statement.
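
Since the wrapper only grows by one xi:include per UDC, it can also be generated rather than hand-edited. This is a minimal Python sketch of that idea using only the standard library; the function name is my own, and the output is meant to be equivalent to the hand-written wrapper above, not byte-identical:

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"


def build_wrapper(udc_files):
    """Return an XInclude wrapper document with one xi:include per UDC file."""
    ET.register_namespace("xi", XI_NS)
    root = ET.Element("list")
    for path in udc_files:
        ET.SubElement(
            root,
            "{%s}include" % XI_NS,
            {"href": path, "xpointer": "xpointer(//list/*)"},
        )
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or later
        ET.indent(root)
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_wrapper(["File1.udc", "File2.udc", "File3.udc"]))
```

Saving that output as concatenate-UDC.xml gives a wrapper for however many UDC files are listed.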

To act on this file, we execute xmllint using the “-xinclude” parameter (normal hyphen, only one “-”, not two) as follows. Note that xmllint is available on most non-Windows systems, and should be easily acquired using Microsoft Services for UNIX for a Windows system.

xmllint.exe -xinclude concatenate-UDC.xml > Merged.udc

for Windows, or for non-windows:

xmllint -xinclude concatenate-UDC.xml > Merged.udc

(using “Merged.udc” as a temporary file)
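
Where xmllint is hard to come by, the same concatenation can be approximated in a few lines of Python. This is a sketch under one assumption, namely that “list” is the root element of every UDC file as described above; it copies only the direct children of that root, which is slightly narrower than the //list/* XPointer:

```python
import xml.etree.ElementTree as ET


def merge_udc_roots(roots):
    """Append every child of each <list> root under a single new <list> root."""
    merged = ET.Element("list")
    for root in roots:
        if root.tag != "list":
            raise ValueError("expected <list> root, found <%s>" % root.tag)
        merged.extend(list(root))
    return merged


def merge_udc_files(paths, out_path):
    """Parse each UDC file and write the concatenated result to out_path."""
    roots = [ET.parse(path).getroot() for path in paths]
    tree = ET.ElementTree(merge_udc_roots(roots))
    tree.write(out_path, encoding="utf-8", xml_declaration=True)
```

A call such as `merge_udc_files(["File1.udc", "File2.udc"], "Merged.udc")` would produce the same temporary file, with the same cleanup still required afterwards.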

We now have a UDC file with only one outermost “list” element or root element, but it has a few problems:

  1. Every new UDC starts with Evaluation Order of 1; this is reflected in the UDC, and has to be fixed
  2. Only one default item should be given: we choose the one from the first file
  3. We only copy the first file’s definition of the UDC (Metric, set, etc) so the user needs to avoid doing this on two UDCs of different metric/set (illogical UDCs will result)

The first two issues can be fixed in the next step.

Clean the Concatenated Result

XSLT, or XSL Transformations, uses XSL (Extensible Stylesheet Language) to transform XML into a different XML, or even into a simpler form such as straight text or ambiguous markups such as CSV. In general, XSLT can map XML data from one schema to another, convert data from one schema to another, or simply extract elements of data into a text stream.

In our case, we’re using it to remove the redundant parts that will cause VirtualWisdom’s UDC parser to reject the document. There is currently no schema definition, so we have to make best efforts to make the resulting UDC look like one exported from VirtualWisdom.

The XSLT is a bit complex to post here, but it should be available by clicking on the marked-up filename below. Note that like xmllint, xsltproc is widely available as Linux and UNIX packages, or via the Microsoft Services for UNIX (currently a re-packaged Cygwin environment).

We execute the cleanup XSLT as follows:

xsltproc.exe concatenate-UDC.xsl Merged.udc > \VirtualWisdomData\UDCImport\Combined.udc

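for Windows, or for non-Windows (forward slashes here, because an unquoted backslash is an escape character in a POSIX shell; adjust the path to wherever the VirtualWisdomData folder is reachable):

xsltproc concatenate-UDC.xsl Merged.udc > /VirtualWisdomData/UDCImport/Combined.udc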

Note here that the file we use is similarly named, but ends in “xsl”, not “xml”. Also, we write the file directly into the UDCImport directory of a VirtualWisdomData folder, which is where an import schedule would look for it.

This resulting file can be directly imported; an example import schedule is at the bottom of the Use UDCs to Collect Devices by Name Pattern article presented on May 1st, 2012. As well, because the UDC is in a standard form, it can be used to Quickly Create Filters for use in Dashboards, Reports, and Alarms.

Evolving SANs tend to have evolving naming schemes and assignment methods, so there will often be many different systems of identifiers that can be joined to work with different Business Units, customers, or functional groups; different groups tend to cause different sources of information to be polled, and different formats to result. I hope this process helps you to reduce the manual copying of data attributes, which is so prone to human error and scheduling delays.

I hope this helps you to “set it and forget it” on more sources of data. Accurate data drives decisions: how can you methodically fix what you cannot measure and make sense of?

Use VirtualWisdom Alarms to Schedule Daily Tasks

The VirtualWisdom Service part of the VirtualWisdom Platform doesn’t necessarily do everything: our customers’ SANs differ in the small details as well as the larger ones, necessitating VI Services to help with some customization. In many cases, we set things up to run daily, such as grabbing zone info to convert to Nicknames, or converting Nicknames to UDCs and Filters.

In some cases, customers cannot edit the Windows Scheduler to run these, and do not have a UNIX-like system with an available scheduler. This can be due to access, or corporate policy. I wanted to share a workaround for this situation: (mis-)use the Alarm system to do so.

The following image may explain more efficiently than a walk-through:

Example daily alarm to run a batch file

As you can see by the name, the alarm policy should only be applied to one ProbeSW — to one SAN Switch.

The alarm will trigger when any data flows (you can see the trigger set to “> 0, 1 matching interval in domain of 1 interval”), and all it does is run an external program. The configuration of that external program is also opened in the editor, and you can see that it simply runs a script (using full pathname).

The re-arm of that Alarm Policy Rule is “MB/sec != -1”. Because MB/sec can only go down to zero, “-1” is impossible, so this rule will always match. The trick is that this has to match once triggered, and has to match for 288 intervals (288 x 5 minutes = 24 hours). Effectively, this is a logic statement that says “don’t run more often than every 24 hours”.

This Alarm Policy Rule effectively runs immediately after the Portal Service is restarted or the Alarm Policy is applied to a switch, and will run every 24 hours thereafter (understanding that 288 might need to be 287 to avoid a 5-minute skew daily).
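
The interval count is simple arithmetic, but it is worth recomputing if you want a period other than daily. A small Python sketch, assuming the 5-minute interval used above (the function name is only for illustration):

```python
def rearm_intervals(period_hours, interval_minutes=5):
    """Return how many whole intervals make up the requested re-arm period."""
    total_minutes = period_hours * 60
    if total_minutes % interval_minutes:
        raise ValueError("period is not a whole number of intervals")
    return total_minutes // interval_minutes


print(rearm_intervals(24))  # 288
print(rearm_intervals(12))  # 144
```

If the daily skew mentioned above matters, subtract one from the result before entering it in the re-arm rule.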

The “meat” or complexity here would be in the BAT file: the Alarm uses the “External Script” action to run our batch file daily. This avoids configuring the OS Scheduler, but at a cost of not being able to choose the exact time. Additionally, the BAT file executes with the permissions of the Portal Server, which typically cannot view Network Shares and other remote resources.

Move Your VirtualWisdom Backups into Your Backed-Up Space

VirtualWisdom has an easy backup system: backups are as simple to configure as any scheduled event, can run as frequently as daily, at any time, and multiple schedules can re-use the same configuration. VirtualWisdom chooses a new filename every time to avoid overwriting a good backup with one that might run into some exception and be incomplete; the side effect is that a new backup file appears with every run (often each week), with no simple method of aging-out old backups.

The Post-Backup Script in the Backup Service Configuration runs after every backup, if activated: it simply executes a script with a few parameters. This allows the VirtualWisdom Administrator a certain flexibility in writing any manner of script that can run as the VirtualWisdom process to accomplish the automated moving around of backup files — or, logically, any task, even unrelated to the backup.

As defined by the underlying database vendor, our database files need to remain untouched by backup and antivirus processes, which tend to lock the files for long periods. Any locked data file tends to block database writes, slowing throughput and risking corruption of the data. This requirement also means that backups are typically outside of corporate backup tools and policies; the risk of a backup not being preserved in a catastrophic filesystem exception is clearly significant. Although VirtualWisdom only handles measurements and data about the data (it does not handle the data itself, and does not form a critical path in data I/O), loss of VirtualWisdom is loss of the measurement and analysis tools which may be critical to resolving storage issues. Clearly we want the backup for VirtualWisdom to be safely archived.

In this article, I’d like to share one example of how successful backups can be moved into the filesystems covered by corporate backup policies, replacing past backups to avoid ever-increasing disk usage. My content here on the Virtual Instruments SAN Best Practices blog tends to be of a technical “how-to” nature; we hope this article may help define a customer’s backup config, giving safety to the data so that focus can return to the performance and availability of the SAN.

Overview

The basic backup process is a sequence such as:

  1. lock the database (database becomes read-only)
  2. quickly duplicate all database files
  3. unlock the database and let processing continue
  4. aggregate the backup files into a single file, optionally compressing

The feature we want to exploit to improve this process is the optional “Execute the following command upon completion” entry on a Backup Service Configuration to move the backup file to where it should be. In most cases, “where it should be” is a disk covered by corporate backup processes with sufficient space to hold the backup, compressed, accounting for organic growth (database backup grows as number of monitored ports, VMs, ESXs, and ITLs increase over time).

For our example, that is the “X” drive. Bear in mind that the backup script runs as the VirtualWisdom process, which runs as a service hence has no access to network drives. In our example, the “X” drive might even be a SAN LUN: even though we recommend that the disk not be on a SAN LUN due to the risk of being affected by the performance problems and exceptions that VirtualWisdom is trying to help users track and resolve, the backup may be on a SAN LUN because delays in the archived backup do not directly affect performance of the VirtualWisdom platform.

Example Backup Service Configuration

Typically, your backup schedule would look like the following: (except that my work server is small, so I have disabled mine by unchecking the checkbox beside the scheduled time)

Typical Backup Service Config without Post-backup script

… with a Backup Service Configuration such as:

Typical Backup Service Config without Post-backup script

Improved Backup Service Configuration

Instead of merely doing the backup, we can use the “post-backup script” to do the work for us. The “Post-Backup Script” is the name I’ve started using for the script that gets listed in the box for “Execute the following command upon completion”. An example script may be as simple as the following:

Example Post-Backup Script

As we can see, when the second parameter given to the script (“%2”) is a 1, then the filename given as the first parameter (“%1”) is moved to the consistent filename X:\Backups\VirtualWisdomBackup.zip. The X:\ drive would be within normal backup policy, so routine backups would protect the database archive.
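
For readers who prefer something more portable than a batch file, the same logic can be sketched in Python. This is an equivalent of the behaviour described above rather than the script in the screenshot; the function name is mine, and it assumes the script receives the backup filename and the status as its two arguments:

```python
import shutil
import sys

# The fixed destination from the example above; adjust to your backed-up drive.
DEFAULT_TARGET = r"X:\Backups\VirtualWisdomBackup.zip"


def archive_backup(backup_file, status, target=DEFAULT_TARGET):
    """Move a successful backup over the fixed target name.

    Returns True if the file was moved; a failed backup is left untouched.
    """
    if str(status).strip() != "1":
        return False
    shutil.move(backup_file, target)  # replaces the previous archived backup
    return True


if __name__ == "__main__" and len(sys.argv) == 3:
    archive_backup(sys.argv[1], sys.argv[2])
```

Because the target name never changes, each successful run replaces the previous archive, which keeps disk usage flat.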

This batch file is run by entering it as a “post-backup script” as follows. NOTE: where possible, use a full pathname to ensure the script is found, and it’s the correct script.

Example Backup Service Config with a post-backup script

As we can see in this Backup Service Configuration, we have enabled the “Execute the following command upon completion” checkbox, and listed our script as the script to run. The two parameters are selectable with the “Insert” box, or may be directly typed free-form.

When the script runs after a backup is complete, the $BACKUP_STATUS$ is replaced by a 1 or a 0 depending on whether the backup was successful; as noted above, if this value is “1”, the working file is moved; otherwise, it’s untouched. Perhaps an enhancement might be to raise an alert that the backup failed (VirtualWisdom logs backup failures in the Portal log, but makes no other indication), or to delete or move aside a failed backup as well for analysis and fault-resolution.

When the backup is complete, a new backup file is created, named after the time that the backup started: backup - yyyy-mm-dd-HH-MM.zip, where yyyy is the year, mm is the month (zero-padded), dd is the day (zero-padded), HH is the hour (24-hour time), and MM is the minutes (zero-padded). Yes, this is intentionally very close to ISO8601, which is the basis for the RFC3339, HTML5, and XML date formats. With a new, always-incrementing filename, new backups will never overwrite previous backups, but they are difficult to track down. The $BACKUP_FILE$ token is replaced by this filename, allowing the post-backup script to work with the correct filename every time.
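
Because the filename encodes the start time, a script can recover that time and pick the newest backup without consulting file timestamps. A minimal Python sketch, assuming the name is exactly of the form described above (the spacing around the hyphen is taken from this article and should be checked against a real backup file):

```python
from datetime import datetime

# Assumed pattern: "backup - yyyy-mm-dd-HH-MM.zip", as described in the text.
BACKUP_NAME_FORMAT = "backup - %Y-%m-%d-%H-%M.zip"


def backup_start_time(filename):
    """Return the backup start time encoded in the filename."""
    return datetime.strptime(filename, BACKUP_NAME_FORMAT)


def newest_backup(filenames):
    """Return the most recent backup name, ignoring names that do not match."""
    dated = []
    for name in filenames:
        try:
            dated.append((backup_start_time(name), name))
        except ValueError:
            continue  # not a backup file in the expected format
    return max(dated)[1] if dated else None
```

The same parsed timestamps could drive an aging-out policy, such as deleting everything older than the newest few.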

Of course, in order to summarize the underlying behaviour, we do change the name of the schedule itself, but it’s not critical:

Backup Configuration with post-backup script

In most articles, we include complete examples, but the development and explanation of this relatively simple example is a complete example. Of course, changes will have to be made for each individual unique environment. Most backups do not run to the C:\ drive because there would not be sufficient space; rather, most configurations have a D:\ drive or E:\ drive for data, and that drive is used as a working drive during backups.

Quickly Create Filters for VirtualWisdom UDC Values

The UDC capability in VirtualWisdom enables quite a powerful ability to group fabric entities based on a number of parameters, but creating the filters to use a large UDC can be a bit cumbersome. UDC is VirtualWisdom’s User-Defined Context, allowing a virtual metric value to be defined within summaries, calculated based on powerful expressions.

Typically, UDCs are used to separate and group entities such as:

  • Physical Datacenter to filter physical-layer alerts (such as CRCs) to the correct ticket queue for inspection
  • Business Unit (BU) UDCs to filter performance alerts (such as response-time) against Business-Unit-specific thresholds (e.g. Oracle requires 12ms response time, but the NFS filer accepts 20ms)
  • Port/Blade/ASIC calculations
  • Grouping a SuperDome’s ports or an Array’s ports for filtered reports

As well, UDCs are used for “what-if” calculations: what if the SCSI traffic from a certain HBA was zoned to a different storage port; would it overload the queue and link speed? What-if UDCs are an extremely powerful tool to prove capacity based on historical use, but somewhat out-of-scope for this article.

My content in Virtual Instruments’ SAN Best Practices tends to be of the how-to nature; in this article, I’d like to share a simple method of creating all the “X = Y” filters for a specific UDC programmatically, which can reduce the time-to-value in new installs or changing environments. When linked with other generation how-to articles (such as nickname collection, or generating UDC by transform), this can further reduce the effort of managing a very large SAN.

Process Overview

For this process, our workflow will look like the following:

As you can see, the starting file “UDCExport.udc” can be either exported from the VirtualWisdom Portal itself, or can be generated by other means. The file is converted using xsltproc using a “program” or “script” UDC2Filter.xsl, resulting in Filters.xml which can be imported manually to VirtualWisdom.

Overview

UDC Files in VirtualWisdom are a specific schema of XML file; as such, standard easily-available license-free tools such as xpathget, xmllint, or xsltproc can be used to interrogate, validate, or convert the starting XML to a different format, even generating CSV or simple text in the process.

XSLT stands for XSL Transformations; XSL (Extensible Stylesheet Language) is a stylesheet language for XML, similar to CSS describing the style of a free-form HTML page. In essence, XSL can be considered a CSS in XML, but rather than only marking up content (such as typeface and style for large printed content), XSL can also transform and convert content. XSLT is the act of using XSL markup in a standalone processor (xsltproc) to create content based on XML content. In many cases, this is XML generating XML, but it can also be used to write TSV, CSV, JSON, etc.

VirtualWisdom Filters are exported as another schema of XML file, and can be similarly manipulated by standard XML tools. Even though this XML is a text-based format, editing it with a text editor is prone to human error. We can read XML for debugging (xmllint -format), but as the content gets larger, it is safer to treat it as though XML were an opaque binary format, which again leads us to the free tool “XSLT”.

In our case, a specific XSLT file is used to manipulate a UDC definition into a list of Filter definitions: UDC2Filter.xsl guides the conversion of UDC Values to Filters which match them.

Running the Script

xsltproc is available on most non-Windows platforms as an installable RPM, SSO, .deb, .pkg, or similar pre-packaged open source project; on Windows, it can be installed per SageHill’s Instructions; a file xsltproc.zip is easily obtained from any VI FAE to accelerate your install process.

Running it is quite simple:

xsltproc.exe -o Filters.xml UDC2Filter.xsl UDCExport.udc

There’s no output: all generated content goes directly to the output filter file.
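
When several UDC exports need converting, the same command can be driven from a small wrapper. The following Python sketch only builds and runs the command line shown above; the helper names and the “_Filters.xml” output naming are my own convention (matching the complete example in this article), and xsltproc is assumed to be on the PATH:

```python
import subprocess
from pathlib import Path


def xsltproc_command(stylesheet, udc_file, exe="xsltproc"):
    """Build the xsltproc argument list for one UDC export."""
    udc = Path(udc_file)
    # Naming assumption: Application_SW.udc -> Application_SW_Filters.xml
    output = udc.with_name(udc.stem + "_Filters.xml")
    return [exe, "-o", str(output), str(stylesheet), str(udc)]


def convert_all(stylesheet, udc_files, exe="xsltproc"):
    """Run the transform for each UDC file, stopping on the first failure."""
    for udc_file in udc_files:
        subprocess.run(xsltproc_command(stylesheet, udc_file, exe), check=True)
```

On Windows, pass `exe="xsltproc.exe"` (or its full pathname) to `convert_all`.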

Complete Example

In order to show the full process, in case I’ve left out some details or some details seem implied, this is a full example based on data in our demo databases (which we use for demos and training):

Given the following UDC:

UDC that we start with for our example UDC2Filter

We export this UDC to Application_SW.udc, run the XSL Transform as follows:

xsltproc.exe -o Application_SW_Filters.xml UDC2Filter.xsl Application_SW.udc

The result we get in Application_SW_Filters.xml looks like this:

Clearly this example is only a few filters, no big deal. The benefit comes in when there are more than a half-dozen to build (recently, a 212-value UDC was tested). As well, if the UDC is edited (perhaps based on automated processes) then the administrator must go through and check that every value has a filter.

Unfortunately, there is no schedule-action for Filter import.

Use LUN Nicknames in VirtualWisdom to Identify VLUN SymmDevice Names

When some customers look at the output of our Hardware Probes such as the 8g FC8, and they’re confirming that all their Oracle transactions are meeting the SLA of 8ms, the LUNs they see are a bit unusual. They’re used to seeing LUNs such as LUN32, LUN33, but VirtualWisdom shows them LUNs like 17652, and they just don’t match up. This hinders the utility of the data, and may reduce the confidence they have in the data itself. When data drives decisions, we need accuracy and we need to be as easy to use as we can based on what we have available on the FC link.

The truth of the mismatch is that the devices involved actually convert the LUN numbers: where VirtualWisdom shows you “17652”, that’s actually the LUN “on-the-wire”. The actual LUN in the SCSI frame is that large number, but vendor-specific tools convert it to manageable, familiar numbers, which can make the actual LUN appear “wrong”. Although it’s very simple to say “VirtualWisdom isn’t making the same conversion as your vendor-specific tools”, we’d rather be more helpful. When you have a problem on your SAN, “VirtualWisdom doesn’t support…” offers no help towards fixing the problem. Rather than “17652”, we’d rather tell you “Symm Device 1E2B”, which (for a very important customer of ours) is a comfortable middle-ground in identifiers and terms for the LUNs.

So how can we get a usable label on those LUNs? How do we do the lookups for you so that in an emergency, you have the details you need, already dereferenced?

As you’ve seen in the articles I’ve posted, I’m a Field Application Engineer for Virtual Instruments, and making this conversion — plus automating it — are the challenges I enjoy working on and sharing. This how-to article shares how to get SymmDevice aliases mapped onto LUNs as nicknames in a scriptable method. Let’s dive in:

Process Overview

For this process, our flow will look like the following:

In this diagram, the “syminq” command is used to generate the “syminq.txt” file; that part of the process is quite dependent on the tools available on your servers, hence shown dotted.

Overview

We at Virtual Instruments have witnessed this odd mismatch in LUNs for some time, and initially we just said “Subtract 0x4000”, but that didn’t work precisely. In another instance, we found two servers with a very similar number, so the informal rule became “subtract 17506”, but we later found that this rule was unusable outside that customer. Recently, in very detailed discussions, a customer found the logic that gets them to their LUN Nicknames, as follows:

The “sysinfo” file looks like this: (mocked-up example from testcases)

disk 1071 8/0/12/1/0.140.0.54.6.1.4 sdisk CLAIMED DEVICE EMC SYMMETRIX
lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 eslpt CLAIMED LUN_PATH LUN path for disk6219
disk 1071 8/0/12/1/0.140.0.54.6.1.4 sdisk CLAIMED DEVICE EMC SYMMETRIX
disk 1071 8/0/12/1/0.140.0.54.6.1.4 sdisk CLAIMED DEVICE EMC SYMMETRIX
lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 eslpt CLAIMED LUN_PATH LUN path for disk6219
lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 online

The key part of this file is the line that says:

lunpath ...0x50060482d5123456.0x4392... disk6219

In this case, “50060482d5123456” is the storage device’s WWN, and 0x4392 is the LUN (17298 in decimal) we detect as the actual LUN used in the SCSI exchange for disk6219.
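As a rough illustration of that lookup (not the AWK script itself), this Python sketch pulls the three values out of a lunpath line. The regular expression and the name parse_lunpath are my own assumptions, based only on the mocked-up sample above; real output on your servers may differ.

```python
import re

# Assumed layout of a "lunpath" line, taken only from the mocked-up sample:
#   lunpath <n> <hw-path>.0x<16 hex digit WWN>.0x<16 hex digit LUN> ... disk<NNNN>
LUNPATH_RE = re.compile(
    r"^lunpath\s+\d+\s+\S*?\.0x([0-9a-fA-F]{16})\.0x([0-9a-fA-F]{16})\s.*\b(disk\d+)\s*$"
)

def parse_lunpath(line):
    """Return (wwn, lun_decimal, disk_name), or None if the line does not match."""
    match = LUNPATH_RE.match(line)
    if match is None:
        return None
    wwn, lun_field, disk = match.groups()
    # Following the article, the LUN is the first four hex digits of the field.
    return wwn.lower(), int(lun_field[:4], 16), disk

sample = ("lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 "
          "eslpt CLAIMED LUN_PATH LUN path for disk6219")
print(parse_lunpath(sample))  # ('50060482d5123456', 17298, 'disk6219')
```

Lines without a trailing diskNNNN (such as the “online” line in the sample) simply return None.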

For a given server, the “syminq” file looks like this: (again, mocked-up example from testcases)

/dev/rdisk/disk6219 R1 000190102037 5773 3701E8B000 2096640
/dev/rdsk/c23t1d2 R1 000190102037 5773 3701E8B000 2096640
/dev/rdsk/c43t1d2 R1 000190102037 5773 3701E8B000 2096640
/dev/rdsk/c133t14d5 R1 000190102037 5773 3701E8B000 2096640

The key part of this file is the line that says:

...disk6219 ...3701E8B...

That rdisk entry is as follows: (keep in mind, the content is replaced by bogus values: some common values may not align anymore)

  • /dev/rdisk/disk6219 is a local device node on the host
  • I’m not sure what “R1” stands for
  • 000190102037 is the Symmetrix serial number
  • I’m not sure what “5773” means (model number?)
  • 3701E8B000 breaks down as:
    • 37 — last two digits of serial 000190102037
    • 0 — not sure
    • 1E8B — the SymmDevice ID
    • 000 — Director port number
  • 2096640 (2^21 - 512) is the capacity; assuming syminq’s usual KB units, the LUN is roughly 2GB

I don’t have official information, but it’s possible that 01E8B is actually “Symmetrix device 01E”, and “Director #8B”. 0x8b larger than 16 used to mean Processor B, but that seems a bit outdated now. The tools we use could easily give out 5-digit SymmDevice-XXXX values if we could confirm this numbering breakdown.
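To make the tentative breakdown concrete, here is a small Python sketch that slices the ten-character field the same way. The field boundaries are only the guesses listed above, and split_symm_serial and its key names are hypothetical names of mine, not anything from EMC or VirtualWisdom.

```python
def split_symm_serial(field):
    """Slice a 10-character syminq serial field per the tentative breakdown."""
    if len(field) != 10:
        raise ValueError("expected a 10-character field, got %r" % field)
    return {
        "array_suffix": field[0:2],    # last two digits of the array serial
        "unknown": field[2:3],         # meaning not confirmed
        "symm_device": field[3:7],     # the SymmDevice ID
        "director_port": field[7:10],  # director port number (unconfirmed)
    }

parts = split_symm_serial("3701E8B000")
print("SymmDevice-" + parts["symm_device"])  # SymmDevice-1E8B
```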

Essentially, the SysInfo2LUNNickname.awk script does the following:

  1. pre-loads a syminq file to provide “better” nicknames where possible
  2. parses the sysinfo to realize nicknames, substituting syminq results where possible
  3. outputs the results as a new-style (VirtualWisdom v2.1 or later) nickname CSV

The resulting nicknames.csv has a series of lines such as:

"LUN","50060482d5123456","17296","disk6217"
"LUN","50060482d5123456","17297","disk6218"
"LUN","50060482d5123456","17298","SymmDevice-1E8B"
"LUN","50060482d5123456","17299","disk6220"

As you can see, where there is a “better” match, “SymmDevice-1E8B” is used; otherwise, “diskXXXX” is still there. At this particular customer, “1E8B” is an example of a common name for their LUNs rather than the 17298 reported by VirtualWisdom or the 0x4392 reported in hex.
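For readers who prefer to see the three steps spelled out, the following is a minimal Python sketch of the same flow. It is not the AWK script: the column positions are assumed from the mocked-up samples, and load_syminq, nickname_rows, and to_csv are hypothetical names.

```python
import csv
import io

def load_syminq(lines):
    """Step 1: map a device node such as 'disk6219' to 'SymmDevice-1E8B'."""
    names = {}
    for line in lines:
        parts = line.split()
        # Assumed columns: node, type, array serial, rev, device serial, capacity
        if len(parts) >= 6 and len(parts[4]) == 10:
            node = parts[0].rsplit("/", 1)[-1]
            names[node] = "SymmDevice-" + parts[4][3:7]
    return names

def nickname_rows(sysinfo_lines, symm_names):
    """Step 2: one row per (WWN, LUN), preferring the syminq name if known."""
    rows, seen = [], set()
    for line in sysinfo_lines:
        parts = line.split()
        if len(parts) < 4 or parts[0] != "lunpath" or not parts[-1].startswith("disk"):
            continue
        path = parts[2].split(".0x")  # [hw-path, WWN, LUN field]
        if len(path) != 3:
            continue
        key = (path[1].lower(), int(path[2][:4], 16))
        if key not in seen:
            seen.add(key)
            disk = parts[-1]
            rows.append(("LUN", key[0], str(key[1]), symm_names.get(disk, disk)))
    return rows

def to_csv(rows):
    """Step 3: emit fully quoted CSV lines like the sample above."""
    buffer = io.StringIO()
    csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
    return buffer.getvalue()

syminq = ["/dev/rdisk/disk6219 R1 000190102037 5773 3701E8B000 2096640"]
sysinfo = ["lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 "
           "eslpt CLAIMED LUN_PATH LUN path for disk6219"]
print(to_csv(nickname_rows(sysinfo, load_syminq(syminq))), end="")
```

With an empty syminq list, the same row falls back to “disk6219”, which mirrors the “diskXXXX” behaviour described above.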

For exceptionally large syminq files, because gawk.exe is loading up a large associative array, memory usage may temporarily peak; this has been tested on 500k – 1MB text files without visible adverse effects, but hidden memory usage such as interpreters’ associative arrays and implicit memory-management is something to be aware of.

Running the Script

gawk.exe -v SYMINQ=syminq.txt -f SysInfo2LUNNickname.awk sysinfo.txt > nicknames.csv

It’s that simple. In order to improve re-use, I collected this logic as the AWK script because I’ve never had portability problems with AWK except for the UNIX/Windows CR/CRLF/LF text line-endings debate.

The script is non-interactive, but prints the results, so the script is typically run with the output redirected to a file. Running it is very quiet, as you’d expect:

gawk execution screencap showing LUN Nickname creation

Complete Example

A complete example is difficult for this how-to because the files used are scattered on each server; collecting the sysinfo, and the syminq output from each server may be the more difficult part. My test example looks like the following (I always recommend using full pathnames for tools):

@echo off
cd \VirtualWisdomData\

REM one command, split across lines with ^ continuations for easier reading
C:\UnxUtils\usr\local\wbin\gawk.exe ^
  -v SYMINQ=\sandata\syminq01.txt ^
  -f \sandata\SysInfo2LUNNickname.awk ^
  \sandata\sysinfo > \VirtualWisdomData\DeviceNickname\nicknames.csv

Similar to previous Nickname-generation/derivation tutorials, this process generates a single nicknames.csv file. You could append this result to any other generated file for which you already have an import schedule, or create a new one. In order to be as complete as possible, I’ve included an import schedule example that should be surprisingly similar to the others (and similarly brief: the User Guide has more detail regarding Schedules)

  1. Views Application, Setup tab
  2. “Schedules” page, roughly 5th item down
  3. Create a new Schedule, with the action “Import WWN Nicknames” (or, if you prefer, “Import LUN Nicknames”)
  4. …and configure it to use a new WWN Importing Configuration. NOTE we only use a local filename (here, nicknames.csv); all files are in the \DeviceNickname directory of your VirtualWisdomData folder.

Use UDCs to Collect Devices by Name Pattern in VirtualWisdom

Virtually all SAN devices that are zoned for traffic have names (in fact, if you have nicknames/aliases in your zone files, then you can directly convert zone info to nicknames). VirtualWisdom’s filtering capabilities allow you to restrict a Dashboard, Report, or Alarm Policy Ruleset to a specific datacenter or business unit, but often creating those UDCs can be cumbersome.

A recent customer created UDCs with 192 values across three metric sets, allowing him to group data by specific servers, storage, and virtualizers automatically; this “how-to” is intended to show how you can do the same.

Process Overview

For this process, we need only a set of nicknames; the simpler old-format nickname file looks like:

500604825D2E2144,"DMX1911_FA3AA"
2100001B329FE31D,"Billing44_HBA0"
10000000C741ABCD,"4241_7b1"

(Notice: WWN first, no spaces, optional quotes for safety)
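If you want to sanity-check such a file before feeding it to anything else, a few lines of Python will read it, since the csv module copes with the optional quotes. This is only a sketch; read_nicknames is a hypothetical helper of mine, not part of VirtualWisdom.

```python
import csv

def read_nicknames(lines):
    """Return (WWN, nickname) pairs from old-format nickname lines."""
    pairs = []
    for row in csv.reader(lines):
        if len(row) >= 2:  # skip blank or malformed lines
            pairs.append((row[0].strip().upper(), row[1]))
    return pairs

example = ['500604825D2E2144,"DMX1911_FA3AA"', "10000000C741ABCD,4241_7b1"]
for wwn, nickname in read_nicknames(example):
    print(wwn, nickname)
```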

Generate UDC Values from Nickname Pattern

Our flow for this process or pipeline looks like the following diagram:
Flow of a UDC generated by pattern from Nicknames

The tool we use here is “awk”, or “awk.exe”, or “gawk.exe”; in Solaris, look for “nawk”. It’s on virtually every non-Windows system, Microsoft has a version in its tools for UNIX, or Google may help you find a copy. As well, UnxUtils has a version.

Awk is an interpreter, so needs a script or program, and for that, we use TransformUDC.awk which takes the following parameters:

Parameter         Meaning
COL               What column in the CSV input is the Nickname? (default: 1)
NAMEFCX_LINK      Name of the ProbeFCX::Link UDC
NAMEFCX_SCSI      Name of the ProbeFCX::SCSI UDC
NAMEFCX_SCSIINIT  Name of the ProbeFCX::SCSI UDC, matching Initiators only
NAMEFCX_SCSITARG  Name of the ProbeFCX::SCSI UDC, matching Targets only
NAMESW            Name of the ProbeSW::Link UDC (default: Transformed_UDC)
TRANSFORM         Transform (basically the ‘s/x/y/g’ in a “sed -e ‘s/x/y/g’” command) (default: remove last two _sect_sect: DC_Serv1_fcs0_SW12P121 -> DC_Serv1)
UDCDEFAULT        Default value for UDCs (default: Unknown)

A simple command such as the following will generate our results:

gawk.exe -f TransformUDC.awk Nicknames.csv

In our nickname file, the older format (first and second variant up to VW-3.1) gave us the nickname as the second parameter, so let’s tell the script that the nickname is in column #2:

gawk.exe -v COL=2 -f TransformUDC.awk Nicknames.csv

The problem is: how do we want the UDC values defined? If you’ve used “sed” or “awk” before, there’s a basic replacement term that looks like s/dog/cat/g or gsub("dog","cat",$0) … this part really depends on your nickname format, but looking above, we have nicknames that look like:

500604825D2E2144,"DMX1911_FA3AA"
500604825D2E2145,"DMX1911_FA4AA"
500604825D2E7744,"DMX1927_FA3AA"
500604825D2E7745,"DMX1927_FA4AA"
500604825D2E7747,"DMX1927_FA6AA"
500604825D2E7748,"DMX1927_FA7AA"
500604825D2E774C,"DMX1927_FA13AA"
500604825D2E774D,"DMX1927_FA14AA"
2100001B329FE35D,"Billing43_HBA0"
2100001B329FE35E,"Billing43_HBA1"
2100001B329FE31D,"Billing44_HBA0"
2100001B329FE31E,"Billing44_HBA1"
10000000C741ABCD,"4241_7b1"

We see how, in this example, chopping off everything after the “_” gives names such as “Billing43” and “DMX1927”. In sed and awk, we would write: s/_.*$//g so we’ll use that as our transform. How can we test this?

Trim Quotation Marks

We could trim off the quotation marks around the second field using this: (“,” as field-separator, convert (“) to (), an empty replacement)

gawk -F, '{gsub("\"","",$2); print; }' Nicknames.csv

… unfortunately, in cmd.exe on Windows we need to use double quotation marks around the script, and then we need a bunch of “\” escape sequences, so it looks much more complex running it:

gawk.exe -F, "{gsub(\"\\\"\",\"\",$2); print; }" Nicknames.csv

… which looks like:

awk Transforms: Trim Quotations

Truncate All After “_”

Based on the example above, we can now test whether our transform (“s/_.*$//g”, or gsub("_.*$","",…) ) gives us the results we want, such as (notice: “print $2”, so we’ll only see the second field):

gawk.exe -F, "{gsub(\"\\\"\",\"\",$2); gsub(\"_.*$\",\"\",$2); print $2; }" Nicknames.csv

… which looks like:
awk Transforms: Trim Quotations, Trim Nickname

This means our transform works, so let’s use it in the script:

gawk.exe -v TRANSFORM="s/_.*$//g" -v COL=2 -f TransformUDC.awk Nicknames.csv

Unfortunately, “4241” is not worthwhile to us because it’s only one matching name, so let’s trim that one off by saying “minimum of 2 matching names per UDC value”:

gawk.exe -v MIN=2 -v TRANSFORM="s/_.*$//g" -v COL=2 -f TransformUDC.awk Nicknames.csv
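The combined effect of the transform and the minimum can be sketched in a few lines of Python. This only mimics the behaviour as described here; I am not reproducing TransformUDC.awk, and group_by_transform is a hypothetical name.

```python
import re
from collections import defaultdict

def group_by_transform(nicknames, pattern=r"_.*$", minimum=2):
    """Group WWNs by transformed nickname, dropping groups below the minimum."""
    groups = defaultdict(list)
    for wwn, nickname in nicknames:
        # Same idea as s/_.*$//g: drop everything from the first "_" onward.
        value = re.sub(pattern, "", nickname.strip('"'))
        groups[value].append(wwn)
    return {value: wwns for value, wwns in groups.items() if len(wwns) >= minimum}

nicknames = [
    ("2100001B329FE35D", "Billing43_HBA0"),
    ("2100001B329FE35E", "Billing43_HBA1"),
    ("2100001B329FE31D", "Billing44_HBA0"),
    ("2100001B329FE31E", "Billing44_HBA1"),
    ("10000000C741ABCD", "4241_7b1"),
]
print(sorted(group_by_transform(nicknames)))  # ['Billing43', 'Billing44']
```

The single “4241” entry disappears because only one nickname maps to it.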

Finally, what do we want to call the UDC? The tool always generates a ProbeSW::Link UDC, and if unnamed, defaults to “Transformed_UDC”. The name of the UDC is limited to 32 characters, and values themselves to 24 characters; the name of the UDC becomes the name of the “metric” or context that we are generating. Suppose while working at XYZ Cheese and Dairy Distributors, we want a UDC called xyz-SW-BizUnit; because we need to use “_” rather than “-”, it becomes xyz_SW_BizUnit:

gawk.exe -v NAMESW="xyz_SW_BizUnit" -v MIN=2 -v TRANSFORM="s/_.*$//g" -v COL=2 -f TransformUDC.awk Nicknames.csv
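The naming limits mentioned above are easy to trip over, so a quick pre-check can save a failed import. The sketch below assumes only what is stated here (32 characters for the name, 24 for values, “_” instead of “-”); safe_udc_name and oversized_values are hypothetical helpers, and VirtualWisdom may enforce other rules I am not aware of.

```python
MAX_NAME_LENGTH = 32   # limit on the UDC name, as stated in the article
MAX_VALUE_LENGTH = 24  # limit on each UDC value, as stated in the article

def safe_udc_name(name):
    """Replace '-' with '_' and reject names over the length limit."""
    fixed = name.replace("-", "_")
    if len(fixed) > MAX_NAME_LENGTH:
        raise ValueError("UDC name too long: %r" % fixed)
    return fixed

def oversized_values(values):
    """Return the values that exceed the per-value length limit."""
    return [value for value in values if len(value) > MAX_VALUE_LENGTH]

print(safe_udc_name("xyz-SW-BizUnit"))  # xyz_SW_BizUnit
```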

Let’s run this with a redirection (“>”) to store the results to a file:

gawk.exe -v NAMESW="xyz_SW_BizUnit" -v MIN=2 -v TRANSFORM="s/_.*$//g" -v COL=2 -f TransformUDC.awk Nicknames.csv > \VirtualWisdomData\UDCImport\xyz-UDCs.udc

Running this as a command in cmd.exe is relatively quiet because this is a non-interactive command. It tends to look like the following:

gawk -f TransformUDC Nicknames.csv to xyz-UDC.udc

Importing this example, we see the following (you’ll note: the default has also been set using “-v UDCDEFAULT=Other”) :

Generated UDC values as viewed in VW Views

Schedule UDC Import

Creating the schedule is relatively straightforward: although there is some strong guidance in the VirtualWisdom User Guide, a complete example would look like the following:

  1. Views Application, Setup tab
  2. “Schedules” page, roughly 5th item down
  3. Create a new Schedule, with the action “Import UDC configurations”
  4. …and configure it to use a new UDC Importing Configuration. NOTE we only use a local filename; all files are in the \UDCImport\ directory of your VirtualWisdomData folder.

The benefit here is that the UDC always replaces existing values without prompting. As well, after import, a UDC re-calculates values for both past summaries and new summaries. This allows you to “fix history” if your UDC is not quite correct the first time.

Convert Zone Info to Nicknames for VirtualWisdom

VirtualWisdom uses “Nicknames” or “Aliases” to give human-readable names to attached SAN devices, reducing the time to locate a problem device, but also to help group devices logically as being the same server or storage, and into business units for SLAs and escalation of issues.

We know that Nickname management can be a hassle, but the obvious gains make it worthwhile, so some of our work in Services is helping customers draw this data from existing repositories such as fabric Zoning information. Maintaining aliases in your zones, then converting those to nicknames, means that you only need to maintain one repository of names.

This “How-to” article is targeted at showing how to do this in common environments. As a VI Application Engineer, my content on this feed tends to be more of a lower-level “how to” in nature. This content has been in our internal self-help content, but may be difficult to find.

Collection, then Conversion

Diagram of data-flow of nicknames from switches to VirtualWisdom

The general process tends to be collecting the data, then converting to a compatible format for import. Let’s focus first on Collection, which tends to be a script running at scheduled times during the day or week.

Scripting under Schedule

This tends to be done as a batch file that is triggered through a scheduler such as the Windows Scheduler running a BAT file, or a UNIX-like OS running a shell script from cron or as a passive check under tools such as Icinga.

Where fabric-wide data is used, only one switch per fabric needs to be queried. I tend to use the least-busy switch to avoid adding any load to Core switches or other busy switches.

Will DBTools Work for You?
The easiest method if you have small switches is to use the \Program Files\Virtual Instruments\VirtualWisdom\UnSupported\DBTools\DBTools.exe tool, but running it as a batch command. This tool will connect to the switch using SSH, query the information, and convert it to the right format for import. In essence, the collection and conversion is a single step. For example, using the example username “scott”, password “tiger”, switch IP 192.168.0.1, to a file FabricA.csv in the import directory:

Brocade “alishow”: (the command is all on one line)
DBToolScript.bat -n -st brocade -u scott -p tiger -ip 192.168.0.1 D:\VirtualWisdomData\DeviceNickname\FabricA.csv

Cisco “fcalias”:
DBToolScript.bat -n -st cisco -u scott -p tiger -ip 192.168.0.1 D:\VirtualWisdomData\DeviceNickname\FabricA.csv

For example:

screen cap of the DBToolScript -n run
(In this example, my demo server has the database on the C: drive; this is not the recommended config for production servers! Also, notice how the DBToolScript cannot open a log file — running this command in your VirtualWisdomData directory will allow it to write a log to .\Log\DBToolLog\ )

Put commands such as this into a batch file that runs daily via Windows Scheduler, configure a scheduled Import via the VirtualWisdom Scheduler, and your job is complete!

The DBTools command doesn’t understand all possible nickname sources, and may have problems on some switches; if this method doesn’t work for you, then we resort to the two-stage process. This less-polished method is a bit more versatile, but isn’t as pretty. Once configured, though, it tends to work reliably.

Collection

The more manual collection can be done from four different sources:

  1. Brocade Switch using “zoneshow”
  2. Brocade Switch using “alishow”
  3. Cisco Switch using “show device-alias database”
  4. Cisco Switch using “show fcalias”

Collection requires non-interactive SSH tools such as plink.exe, available from the makers of PuTTY; Google should help you find it, but if you cannot, VI can help redirect you. The general command is:

plink.exe -l username -pw password IP.IP.IP.IP "command" > intermediate.file

For example: (using scott, tiger, 192.168.0.1, and a Brocade/zoneshow switch)

plink.exe -l scott -pw tiger 192.168.0.1 zoneshow > sw-192.168.0.1.zone

… and a Cisco/“show device-alias database” at 192.168.0.3:

plink.exe -l scott -pw tiger 192.168.0.3 "show device-alias database" > sw-192.168.0.3.cisco

These commands give no output when they run, except the first time: the plink.exe command wants you to accept a key to later ensure you are not vulnerable to a man-in-the-middle attack, which looks like the following: (accept the key once, later you won’t be asked unless it changes)

Again, you only need to collect zoning information or fabric-wide alias information from one switch per fabric. With a unique filename per fabric, you’re ready to convert these files.
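If you would rather drive the collection from a script than a BAT file, the same plink.exe call can be wrapped in Python. This is a sketch under assumptions: it uses only the -l and -pw options shown above, plink.exe is assumed to be on the PATH (or pass a full pathname), and plink_command and collect are hypothetical names.

```python
import subprocess

def plink_command(user, password, host, remote_command, plink="plink.exe"):
    """Build the argument list for one non-interactive plink.exe run."""
    return [plink, "-l", user, "-pw", password, host, remote_command]

def collect(user, password, host, remote_command, out_path):
    """Run the remote command and store its output in out_path."""
    with open(out_path, "w") as out:
        finished = subprocess.run(
            plink_command(user, password, host, remote_command), stdout=out
        )
    return finished.returncode  # non-zero usually means the run failed

# Example (needs plink.exe and a reachable switch, so it is left commented out):
# collect("scott", "tiger", "192.168.0.3", "show device-alias database",
#         "sw-192.168.0.3.cisco")
```

Passing the arguments as a list avoids another layer of quoting around commands that contain spaces.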

Conversion

Brocade and Cisco tend to use consistent formats for their outputs, but they are in text format. Most times, the two scripts below work for this. These are scripts for the “awk” tool, which can be extracted from the UnxUtils project, or using Microsoft’s tools for UNIX. Either method gives you an “awk.exe” or a “gawk.exe”, which will execute these scripts:

  1. brocade-alishow2wwncsv.awk
  2. cisco-devicealias2wwncsv.awk

Conversion of a Brocade zoneshow or alishow is done as:

gawk.exe -f brocade-alishow2wwncsv.awk sw-192.168.0.1.zone > D:\VirtualWisdomData\DeviceNickname\FabricA.csv

Whereas conversion of a Cisco device-alias database or fcalias is done as:

gawk.exe -f cisco-devicealias2wwncsv.awk sw-192.168.0.3.cisco > D:\VirtualWisdomData\DeviceNickname\FabricB.csv

Note: these runs are redirecting output to files, so these commands give no visible output to the cmd.exe screen except in the case of errors.
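To give a feel for what such a conversion involves, here is a minimal Python sketch for the Cisco device-alias case. It is not the cisco-devicealias2wwncsv.awk script: it assumes each alias appears on one line in the form “device-alias name NAME pwwn xx:xx:xx:xx:xx:xx:xx:xx”, and it emits the simpler old-format WWN,"nickname" lines; device_alias_to_csv is a hypothetical name.

```python
import re

# Assumed input line: device-alias name <alias> pwwn <8 colon-separated hex bytes>
DEVICE_ALIAS_RE = re.compile(
    r"^\s*device-alias\s+name\s+(\S+)\s+pwwn\s+"
    r"((?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2})\s*$"
)

def device_alias_to_csv(lines):
    """Return old-format nickname lines; anything unrecognized is skipped."""
    converted = []
    for line in lines:
        match = DEVICE_ALIAS_RE.match(line)
        if match:
            alias, pwwn = match.groups()
            wwn = pwwn.replace(":", "").upper()  # WWN first, no separators
            converted.append('%s,"%s"' % (wwn, alias))
    return converted

switch_output = [
    "device-alias name Billing44_HBA0 pwwn 21:00:00:1b:32:9f:e3:1d",
    "Total number of entries = 1",
]
print("\n".join(device_alias_to_csv(switch_output)))
```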

Complete Example

A complete example of collection and conversion may look like the following code. Be aware, we tend to recommend using full pathnames (i.e. C:\Program Files\something\else\plink.exe) to ensure the commands are found regardless of the %PATH% variable and working directory. This example is simplified to be more readable but does run as-is given the right environment and working directory.

@echo off

plink.exe -l scott -pw tiger 192.168.0.1 "show device-alias database" > sw-192.168.0.1.cisco
plink.exe -l scott -pw tiger 192.168.0.2 "show fcalias" > sw-192.168.0.2.cisco
plink.exe -l scott -pw tiger 192.168.0.3 "zoneshow" > sw-192.168.0.3.zone
plink.exe -l scott -pw tiger 192.168.0.4 "alishow" > sw-192.168.0.4.zone

gawk.exe -f brocade-alishow2wwncsv.awk sw-192.168.0.3.zone sw-192.168.0.4.zone > nicknames.csv
gawk.exe -f cisco-devicealias2wwncsv.awk sw-192.168.0.1.cisco sw-192.168.0.2.cisco >> nicknames.csv

This example generates a single file; the configuration to import this one file is as follows (briefly shown here because the User Guide has more detail regarding Schedules):

  1. Views Application, Setup tab
  2. “Schedules” page, roughly 5th item down
  3. Create a new Schedule, with the action “Import WWN Nicknames”
  4. …and configure it to use a new WWN Importing Configuration. NOTE we only use a local filename (here, nicknames.csv); all files are in the \DeviceNickname directory of your VirtualWisdomData folder.