Revision 516 – Added parsing of OnCommand Data for Nicknames

In revision 516, the basic capability to ask an OnCommand/NetApp management console for the Nickname/WWPN mapping has been added. This allows the reuse of the information it has collected by various device-specific methods to add meaning and facility to VirtualWisdom.

Avoiding re-polling and re-querying reduces both the load on the polled devices and the management effort: re-using the effort already spent configuring one tool means that no additional effort is required, so long as the information is usable.

In this case, the same parser logic is used as was added for the BNA query, which was later expanded to cover both “pools” in BNA. Like the BNA work, the OnCommand query simply reformats the query results and sends them through the array of parsers to vote upon:

java -jar vict.jar --nickname=osmsql://user:pass@server:port/ --nicknameout=\VirtualWisdomData\DeviceNickname\nicknames.csv

The default user/pass are not accurate, so that still needs to be resolved. Like the BNA parser, this method queries the underlying database directly, so it needs direct access to the server (through any firewalls or filters) and is vulnerable to schema changes.

Revision 515 – Added Vote02-ZoneSplit2AndOther for Zone-Voting Algorithm

Many customers choose to give each attached device (Server, Tape, Storage) a nickname or alias, and create their zones that way. Some customers choose to nickname the individual connections (FA or HBA ports) uniquely within the individual components of a fabric. These are very simple to pull out of a copy of the zoning information — whether it be via text files from switches dumping their zoning/VSANs, or reading the zone info replicated in BNA. Both of these methods are handled by the “nickname” option to “vict”.

Others choose to store the nicknames or aliases of the various fabric members in the names of the zones themselves; for example, the zone that allows a server “Oracle44”, via Host Bus Adapter “HBA0”, to communicate with “VMAX12” via Front-end Adapter “10aA” might be called “Oracle44_HBA0_VMAX12_10aA_prod_A”. The trailing “prod_A” is a commonly used way of adding text or comments to the name.

In this revision, I added “Vote02-ZoneSplit2andother.awk” for use at companies who nickname zones with additional text, such as Server_HBA and Server_Name_FA_etc_etc.
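The awk script itself isn't listed here, so the following is only a rough Python sketch of the idea the name suggests: split off two pairs of tokens and treat the rest as "other". The function name and the assumption that each member takes exactly two underscore-separated tokens are mine, not taken from the script.

```python
def split_zone_name(zone_name, sep="_"):
    """Split a zone name into (first nickname, second nickname, comment).

    Assumption: the first two tokens name one member, the next two name the
    other member, and anything left over is free-form comment text.
    """
    tokens = zone_name.split(sep)
    if len(tokens) < 4:
        return None  # not enough pieces to derive two nicknames
    first = sep.join(tokens[0:2])
    second = sep.join(tokens[2:4])
    comment = sep.join(tokens[4:])
    return first, second, comment


print(split_zone_name("Oracle44_HBA0_VMAX12_10aA_prod_A"))
```

Matching each derived nickname to the right WWPN among the zone's members is a separate step that this sketch does not attempt.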

Revision 512 – Vict Nickname Now Parses Cisco Device-Alias Database

The vict (VI Client Tool) “nickname” action now understands the output of a “show device-alias database” command on Cisco.

In the past, customers and users generated these files either by running the command in a plink.exe or ssh session (i.e. the simplest way), or by capturing a log from their ssh session and running the command manually (i.e. the more-work way). Unfortunately, replacing spaces with tabs breaks the awk-based parser, so this smarter parser should be able to handle that now.

My commit note was simply “Added basic parser for output of Cisco ‘show device-database alias’ behind VIClientTool’s --nickname= option”, and that says it all to ME, but to expand for others, this is what that means:

Recall that with multi-column CSVs, Brocade “alishow”, and Brocade “zoneshow”, a user can ignore the format of the file and just send it in with a “nickname” long-option or “N” for the short-option aficionados.

Now, if you give VICT a Cisco “show device-alias database” output, it’ll make sense of that too.

The logic is the same: throw the content scattered at all the parsers, and see who gets the most results.

Although this and other parsers will get the reformatted output from a BNA extraction (and soon, NetApp), only the basic parser ever gets results from those, so this new parser shouldn’t affect anything adversely. …but without much forethought, the user gets another format to use.
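As an illustration of a whitespace-tolerant parser for this format, here is a minimal Python sketch (the tool itself is Java, and its parser is not shown here). It assumes the usual “device-alias name NAME pwwn WWN” line layout; the function name and the sample aliases and WWNs are made up.

```python
import re

# Matches lines such as:
#   device-alias name Oracle44_HBA0 pwwn 10:00:00:00:c9:12:34:56
# \s+ accepts any mix of spaces and tabs between the fields.
DEVICE_ALIAS = re.compile(
    r"^\s*device-alias\s+name\s+(\S+)\s+pwwn\s+"
    r"((?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2})\s*$"
)


def parse_device_alias(text):
    """Return a list of (wwn, nickname) pairs found in the text."""
    results = []
    for line in text.splitlines():
        match = DEVICE_ALIAS.match(line)
        if match:
            nickname, pwwn = match.groups()
            results.append((pwwn.replace(":", "").lower(), nickname))
    return results


# Made-up sample: the second line uses tabs instead of spaces.
sample = (
    "device-alias name Oracle44_HBA0 pwwn 10:00:00:00:c9:12:34:56\n"
    "device-alias name\tVMAX12_10aA\tpwwn\t50:06:04:82:d5:12:34:56\n"
    "some other line the parser should ignore\n"
)
for wwn, nickname in parse_device_alias(sample):
    print(wwn, nickname)
```

Lines that don't match are simply skipped, which is what lets a parser like this return zero results on foreign formats and lose the vote quietly.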

Revision 502 – Obsessive Parser Retry

I’m not proud of this one. Please bear with me:

In revision 493, I wrote about how the parser for “--nickname=” actually pushes content to three separate parsers, and simply chooses the one with the best results. That was all about not trying to guess the content, but leaving the guessing to the parsers. Whichever one gets the most, it wins. Too easy, and extremely scalable.

The problem is that the underlying Apache lib used to fork off the incoming stream (to avoid downloading a file multiple times to parse it) doesn’t always seem to work.

I put a lot of time and concern into trying to figure out why, but in the end, I just added a retry-counter.

When all parsers return a “shoot, I dunno” response, we simply run it again. And again. And again. …not so obsessive because we give up after 3 times, but you’re free to make it as psychotic/obsessive as you want.

To describe this, I verbosely wrote “add retries to the parsing so that we can thrash on a file if we need to just-get-it-done”

I promise to do better design in the future, but for now, this will only re-download a file for each full retry cycle. This doesn’t matter at all for file:// URLs, but for ftp://, bnapsql://, and http://, it will show up as multiple tries.
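A minimal Python sketch of that retry counter, not the tool's Java code: `fetch`, the parser callables and the flaky demo below are all hypothetical stand-ins. It shows the two behaviours described above: one fetch per full retry cycle, and giving up after three attempts.

```python
def parse_with_retries(fetch, parsers, max_attempts=3):
    """Run every parser on freshly fetched content until one returns results.

    Returns (results, attempts_used); results is empty if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        content = fetch()  # one download per full retry cycle
        best = max((parser(content) for parser in parsers), key=len, default=[])
        if best:
            return best, attempt
    return [], max_attempts


# Hypothetical demo: the first fetch comes back empty, the second works.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] == 1:
        return ""
    return "10:00:00:00:c9:12:34:56,Oracle44_HBA0\n"


def csv_parser(content):
    return [tuple(line.split(",")) for line in content.splitlines() if "," in line]


result, attempts = parse_with_retries(flaky_fetch, [csv_parser])
print(attempts, result)
```

Raising `max_attempts` is the knob for making it as obsessive as you like.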

Revision 493 – Nickname Parses ZoneShow

For this revision, I wrote: “enable the --nickname= function to fork inbound content to a number of parsers; the one with the most results wins. Net result: the FAE or user needs not worry what they send to the tool, it will try to figure out what the file is. Supports user-selected columns in CSV, Brocade ZoneShow, and BNA”

What does that mean, in detail?

In the past, --nickname= fed directly into a single consumer that understood the user giving a “;WWN=x” or “;Nickname=y”, and used those columns as input to nickname data.

Now, there are three parsers all feeding from the same resource, so even remote content (i.e. ftp:// and http:// URLs) is only downloaded once, but forked to many parsers. Without the user worrying about format, the three parsers try to interpret the stream to see what they can dig up. The “winning” parser is the one with the most results, effectively adapting to whatever the user sends it …of the three formats currently understood directly:

  • WWN,Nickname (WWN=x and Nickname=y are still effective)
  • Brocade zoneshow output (accuracy is challenged if the user sends a logdump of a screen-scraped output; for best results, treat the output as binary, and convert it directly using plink.exe or ssh, not a screen-capture of a log dump)
  • Brocade binary zone information from BNA versions 11 or 12
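The voting itself can be sketched in a few lines of Python. This is an illustration only: the two parsers below are simplified, hypothetical stand-ins (the alias parser assumes an “alias: NAME” line followed by WWNs), and the real tool's Java parsers are not shown here.

```python
import re

WWN = r"(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}"


def parse_csv(content):
    """WWN,Nickname pairs, one per line."""
    pairs = []
    for line in content.splitlines():
        fields = [f.strip().strip('"') for f in line.split(",")]
        if len(fields) >= 2 and re.fullmatch(WWN, fields[0]):
            pairs.append((fields[0], fields[1]))
    return pairs


def parse_alias_blocks(content):
    """Simplified stand-in for a zoneshow-style parser."""
    pairs, current = [], None
    for line in content.splitlines():
        if re.match(r"\s*(zone|cfg):", line):
            current = None  # WWNs in other blocks are not alias members
            continue
        header = re.match(r"\s*alias:\s*(\S+)(.*)", line)
        if header:
            current, rest = header.group(1), header.group(2)
        else:
            rest = line
        if current:
            pairs.extend((wwn, current) for wwn in re.findall(WWN, rest))
    return pairs


def vote(content, parsers):
    """Feed the same content to every parser; the most results wins."""
    results = {name: parser(content) for name, parser in parsers.items()}
    winner = max(results, key=lambda name: len(results[name]))
    return winner, results[winner]


PARSERS = {"csv": parse_csv, "zoneshow": parse_alias_blocks}

sample = """\
 alias: Oracle44_HBA0
                10:00:00:00:c9:12:34:56
 alias: VMAX12_10aA
                50:06:04:82:d5:12:34:56
"""
winner, pairs = vote(sample, PARSERS)
print(winner, pairs)
```

Adding a format means adding one more entry to the parser table; nothing else has to guess what the content is.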

To re-iterate, the following URI types are understood:

  • http://www.example.com/file.ext
  • ftp://ftp.example.com/file.ext (anonymous FTP; have not tested user/pass)
  • file://current/directory/subdir/file.ext (same as .\current\directory\subdir\file.ext in windows)
  • file:///current/directory/subdir/file.ext (three slashes, same as \current\directory\subdir\file.ext in windows)
  • bnapsql://bna.example.com/
  • no URL: --nickname=sampleZone.zone (confirmed in testcases)

Revision 365 – Inflight BNAPSQL Changes

Revision 365 improves the BNAPSQL Client to avoid null nicknames, and avoid quoting those nicknames that are already quoted.

This was performed live with a customer on the conference call; really great debugging experience.
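For illustration, the two fixes (skip null nicknames, don't re-quote quoted ones) might look like this in Python. The function name and the “WWN,"Nickname"” output layout are assumptions for the sketch, not the client's actual code.

```python
def format_nickname_row(wwn, nickname):
    """Return a CSV line, or None when there is no usable nickname."""
    if nickname is None or not nickname.strip():
        return None  # avoid emitting null/empty nicknames
    nickname = nickname.strip()
    already_quoted = len(nickname) >= 2 and nickname[0] == '"' and nickname[-1] == '"'
    if not already_quoted:
        nickname = f'"{nickname}"'
    return f"{wwn},{nickname}"


print(format_nickname_row("10000000c9123456", "Oracle44_HBA0"))
print(format_nickname_row("10000000c9123456", '"Oracle44_HBA0"'))
print(format_nickname_row("10000000c9123456", None))
```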

As a reminder:

 java -jar vict.jar --nickname=bnapsql://user:pass@server:port/resource

where the default is:

 java -jar vict.jar
   --nickname=bnapsql://dcmadmin:passw0rd@localhost:5432/dcmdb

…and if you just want to spit the nicknames right back out from two servers:

java -jar vict.jar
--nickname=bnapsql://server1/
--nickname=bnapsql://server2/
--nicknameout=output.csv
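To make the defaults concrete, here is a small Python sketch of how a bnapsql:// URL could be filled in from the default shown above. It uses the standard `urllib.parse.urlsplit`; the function name and dictionary layout are hypothetical, and the tool's own Java handling may differ.

```python
from urllib.parse import urlsplit

# Defaults taken from the default URL shown above.
DEFAULTS = {
    "user": "dcmadmin",
    "password": "passw0rd",
    "host": "localhost",
    "port": 5432,
    "resource": "dcmdb",
}


def bnapsql_settings(url):
    """Split a bnapsql:// URL and fill any missing part from DEFAULTS."""
    parts = urlsplit(url)
    if parts.scheme != "bnapsql":
        raise ValueError(f"not a bnapsql URL: {url}")
    return {
        "user": parts.username or DEFAULTS["user"],
        "password": parts.password or DEFAULTS["password"],
        "host": parts.hostname or DEFAULTS["host"],
        "port": parts.port or DEFAULTS["port"],
        "resource": parts.path.strip("/") or DEFAULTS["resource"],
    }


print(bnapsql_settings("bnapsql://server1/"))
```

With only a server name given, as in the two-server example, everything else falls back to the defaults.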

Use LUN Nicknames in VirtualWisdom to Identify VLUN SymmDevice Names

When some customers look at the output of our Hardware Probes such as the 8g FC8, and they’re confirming that all their Oracle transactions are meeting the SLA of 8ms, the LUNs they see are a bit unusual. They’re used to seeing LUNs such as LUN32, LUN33, but VirtualWisdom shows them LUNs like 17652, and they just don’t match up. This hinders the utility of the data, and may reduce the confidence they have in the data itself. When data drives decisions, we need accuracy and we need to be as easy to use as we can based on what we have available on the FC link.

The truth of the mismatch is that the devices involved actually convert the LUN numbers: where VirtualWisdom shows you “17652”, that’s actually the LUN “on-the-wire”. The actual LUN in the SCSI frame is that large number, but vendor-specific tools convert it to manageable, familiar numbers, which can make the actual LUN appear “wrong”. Although it’s very simple to say “VirtualWisdom isn’t making the same conversion as your vendor-specific tools”, we’d rather be more helpful. When you have a problem on your SAN, “VirtualWisdom doesn’t support…” offers no help towards fixing the problem. Rather than “17652”, we’d rather tell you “Symm Device 1E2B”, which (for a very important customer of ours) is a comfortable middle-ground in identifiers and terms for the LUNs.

So how can we get a usable label on those LUNs? How do we do the lookups for you so that in an emergency, you have the details you need, already dereferenced?

As you’ve seen in the articles I’ve posted, I’m a Field Application Engineer for Virtual Instruments, and making this conversion — plus automating it — are the challenges I enjoy working on and sharing. This how-to article shares how to get SymmDevice aliases mapped onto LUNs as nicknames in a scriptable method. Let’s dive in:

Process Overview

For this process, our flow will look like the following:

In this diagram, the “syminq” command is used to generate the “syminq.txt” file; that part of the process is quite dependent on the tools available on your servers, hence shown dotted.

Overview

We at Virtual Instruments have witnessed this odd mismatch in LUNs for some time, and initially we just said “Subtract 0x4000”, but that didn’t work precisely. In another instance, we found two servers with a very similar number, so the informal rule became “subtract 17506”, but we later found that this rule was unusable outside that customer. Recently, in very detailed discussions, a customer found the logic that gets them to their LUN Nicknames, as follows:

The “sysinfo” file looks like this: (mocked-up example from testcases)

disk 1071 8/0/12/1/0.140.0.54.6.1.4 sdisk CLAIMED DEVICE EMC SYMMETRIX
lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 eslpt CLAIMED LUN_PATH LUN path for disk6219
disk 1071 8/0/12/1/0.140.0.54.6.1.4 sdisk CLAIMED DEVICE EMC SYMMETRIX
disk 1071 8/0/12/1/0.140.0.54.6.1.4 sdisk CLAIMED DEVICE EMC SYMMETRIX
lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 eslpt CLAIMED LUN_PATH LUN path for disk6219
lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 online

The key part of this file is the line that says:

lunpath ...0x50060482d5123456.0x4392... disk6219

In this case, “50060482d5123456” is the storage device’s WWN, and 0x4392 is the LUN (17298 in decimal) we detect as the actual LUN used in the SCSI exchange for disk6219.

For a certain server, the “syminq” file looks like this: (again, mocked-up example from testcases)
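The arithmetic can be checked with a short Python sketch. It assumes, as the example suggests, that the LUN is the leading 16 bits of the 64-bit address after the WWN; the function name is made up.

```python
def split_lunpath_address(hw_path):
    """Return (target WWN, on-the-wire LUN) from a lunpath hardware path."""
    _, wwn_hex, lun_hex = hw_path.rsplit(".", 2)
    wwn = wwn_hex[2:].lower()      # drop the leading "0x"
    lun = int(lun_hex, 16) >> 48   # keep the top 16 bits of the 64-bit address
    return wwn, lun


print(split_lunpath_address("8/0/12/1/0.0x50060482d5123456.0x4392000000000000"))
# 0x4392 = 4*4096 + 3*256 + 9*16 + 2 = 17298
```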

/dev/rdisk/disk6219 R1 000190102037 5773 3701E8B000 2096640
/dev/rdsk/c23t1d2 R1 000190102037 5773 3701E8B000 2096640
/dev/rdsk/c43t1d2 R1 000190102037 5773 3701E8B000 2096640
/dev/rdsk/c133t14d5 R1 000190102037 5773 3701E8B000 2096640

The key part of this file is the line that says:

...disk6219 ...3701E8B...

That rdisk entry is as follows: (keep in mind, the content is replaced by bogus values: some common values may not align anymore)

  • /dev/rdisk/disk6219 is a local device node on the host
  • I’m not sure what “R1” stands for
  • 000190102037 is the Symmetrix serial number
  • I’m not sure what “5773” means (model number?)
  • 3701E8B000 breaks down as:
    • 37 — last two digits of serial 000190102037
    • 0 — not sure
    • 1E8B — the SymmDevice ID
    • 000 — Director port number
  • 2096640 (2^21 - 512) is the LUN capacity; syminq normally reports this column in KB, which would make it roughly 2 GB

I don’t have official information, but it’s possible that 01E8B is actually “Symmetrix device 01E”, and “Director #8B”. 0x8b larger than 16 used to mean Processor B, but that seems a bit outdated now. The tools we use could easily give out 5-digit SymmDevice-XXXX values if we could confirm this numbering breakdown.
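The four-way breakdown listed above amounts to fixed-position slicing, sketched here in Python. The field names are my own labels for the guesses in the list, not official terminology.

```python
def split_syminq_id(field):
    """Slice the 10-character syminq field using the breakdown guessed above."""
    if len(field) != 10:
        raise ValueError(f"expected 10 characters, got {len(field)}: {field!r}")
    return {
        "serial_suffix": field[0:2],   # last two digits of the serial number
        "unknown": field[2],           # meaning not confirmed
        "symm_device": field[3:7],     # the SymmDevice ID
        "director_port": field[7:10],  # Director port number
    }


print(split_syminq_id("3701E8B000"))
```

If the alternative reading turns out to be right (a 5-digit device number), only the slice positions would need to change.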

Essentially, the SysInfo2LUNNickname.awk script does the following:

  1. pre-loads a syminq file to provide “better” nicknames where possible
  2. parses the sysinfo to realize nicknames, substituting syminq results where possible
  3. outputs the results as a new-style (VirtualWisdom v2.1 or later) nickname CSV

The resulting nicknames.csv has a series of lines such as:

"LUN","50060482d5123456","17296","disk6217"
"LUN","50060482d5123456","17297","disk6218"
"LUN","50060482d5123456","17298","SymmDevice-1E8B"
"LUN","50060482d5123456","17299","disk6220"

As you can see, where there is a “better” match, “SymmDevice-1E8B” is used; otherwise, “diskXXXX” is still there. At this particular customer, “1E8B” is an example of a common name for their LUNs rather than the 17298 reported by VirtualWisdom or the 0x4392 reported in hex.
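For readers who don't read awk, the three steps can be approximated in Python as below. This is a sketch under assumptions, not a port of SysInfo2LUNNickname.awk: the function names, the regular expression and the de-duplication of repeated lunpath lines are my own choices.

```python
import re

LUNPATH = re.compile(
    r"^lunpath\s.*\.0x([0-9a-fA-F]{16})\.0x([0-9a-fA-F]{16})\s.*\b(disk\d+)\s*$"
)


def load_syminq(lines):
    """Step 1: map host disk names (e.g. 'disk6219') to 'SymmDevice-XXXX'."""
    better = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 5 and len(fields[4]) == 10:
            disk = fields[0].rsplit("/", 1)[-1]
            better[disk] = "SymmDevice-" + fields[4][3:7]
    return better


def nickname_rows(sysinfo_lines, better):
    """Steps 2 and 3: parse lunpath lines and format nickname CSV rows."""
    rows, seen = [], set()
    for line in sysinfo_lines:
        match = LUNPATH.match(line)
        if not match:
            continue
        wwn = match.group(1).lower()
        lun = int(match.group(2), 16) >> 48  # top 16 bits of the address
        if (wwn, lun) in seen:
            continue  # the sysinfo file repeats lunpath lines
        seen.add((wwn, lun))
        disk = match.group(3)
        rows.append(f'"LUN","{wwn}","{lun}","{better.get(disk, disk)}"')
    return rows


syminq = ["/dev/rdisk/disk6219 R1 000190102037 5773 3701E8B000 2096640"]
sysinfo = [
    "disk 1071 8/0/12/1/0.140.0.54.6.1.4 sdisk CLAIMED DEVICE EMC SYMMETRIX",
    "lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 eslpt "
    "CLAIMED LUN_PATH LUN path for disk6219",
    "lunpath 1071 8/0/12/1/0.0x50060482d5123456.0x4392000000000000 online",
]
for row in nickname_rows(sysinfo, load_syminq(syminq)):
    print(row)
```

Without a syminq entry for the disk, the same call falls back to the “diskXXXX” name.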

For exceptionally large syminq files, because gawk.exe is loading up a large associative array, memory usage may temporarily peak; this has been tested on 500k – 1MB text files without visible adverse effects, but hidden memory usage such as interpreters’ associative arrays and implicit memory-management is something to be aware of.

Running the Script

gawk.exe -v SYMINQ=syminq.txt -f SysInfo2LUNNickname.awk sysinfo.txt > nicknames.csv

It’s that simple. In order to improve re-use, I collected this logic as the AWK script because I’ve never had portability problems with AWK except for the UNIX/Windows CR/CRLF/LF text line-endings debate.

The script is non-interactive, but prints the results, so the script is typically run with the output redirected to a file. Running it is very quiet, as you’d expect:

[Screenshot: gawk execution showing LUN Nickname creation]

Complete Example

A complete example is difficult for this how-to because the files used are scattered on each server; collecting the sysinfo, and the syminq output from each server may be the more difficult part. My test example looks like the following (I always recommend using full pathnames for tools):

@echo off
cd \VirtualWisdomData\

REM one command, continued across lines with ^ for easier reading
REM forward slashes in the -v value: awk treats backslashes there as escape sequences
C:\UnxUtils\usr\local\wbin\gawk.exe ^
  -v SYMINQ=/sandata/syminq01.txt ^
  -f \sandata\SysInfo2LUNNickname.awk ^
  \sandata\sysinfo > \VirtualWisdomData\DeviceNickname\nicknames.csv

Similar to previous Nickname-generation/derivation tutorials, this process generates a single nicknames.csv file. You could append this result to any other generated file for which you already have an import schedule, or create a new one. In order to be as complete as possible, I’ve included an import schedule example that should be surprisingly similar to the others (and similarly brief: the User Guide has more detail regarding Schedules).

  1. Views Application, Setup tab
  2. “Schedules” page, roughly 5th item down
  3. Create a new Schedule, with the action “Import WWN Nicknames” (or, if you prefer, “Import LUN Nicknames”)
  4. …and configure it to use a new WWN Importing Configuration. NOTE: we only use a local filename; all files are in the \DeviceNickname directory of your VirtualWisdomData folder.