^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) =================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Linux API for read access to z/VM Monitor Records
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) =================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) Date : 2004-Nov-26
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) Author: Gerald Schaefer (geraldsc@de.ibm.com)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) Description
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) This item delivers a new Linux API in the form of a misc char device that is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) usable from user space and allows read access to the z/VM Monitor Records
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) collected by the `*MONITOR` System Service of z/VM.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) User Requirements
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) =================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) The z/VM guest on which you want to access this API needs to be configured in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) order to allow IUCV connections to the `*MONITOR` service, i.e. it needs the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) IUCV `*MONITOR` statement in its user entry. If the monitor DCSS to be used is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) restricted (likely), you also need the NAMESAVE <DCSS NAME> statement.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) This item will use the IUCV device driver to access the z/VM services, so you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) need a kernel with IUCV support. You also need z/VM version 4.4 or 5.1.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) There are two options for being able to load the monitor DCSS (examples assume
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) that the monitor DCSS begins at 144 MB and ends at 152 MB). You can query the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) location of the monitor DCSS with the Class E privileged CP command Q NSS MAP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) (the values BEGPAG and ENDPAG are given in units of 4K pages).
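
The page numbers can be turned into addresses with a little arithmetic. The
following is a minimal sketch, assuming that BEGPAG and ENDPAG are read as
hexadecimal page numbers and that ENDPAG names the last page of the segment.
The helper names are hypothetical, and the values 0x9000 and 0x97FF are only
illustrative values matching the 144 MB to 152 MB example above.

```c
#include <stdint.h>

#define PAGE_SIZE_4K 4096ULL

/* Address of the first byte of a segment whose first page is begpag. */
static uint64_t dcss_begin_addr(uint64_t begpag)
{
	return begpag * PAGE_SIZE_4K;
}

/* Address just past the last byte of a segment whose last page is endpag. */
static uint64_t dcss_end_addr(uint64_t endpag)
{
	return (endpag + 1) * PAGE_SIZE_4K;
}

/* Round an address up to whole megabytes (1 MB = 1024 * 1024 bytes). */
static uint64_t addr_to_mb_round_up(uint64_t addr)
{
	const uint64_t mb = 1024ULL * 1024ULL;

	return (addr + mb - 1) / mb;
}
```

With these assumptions, a BEGPAG of 0x9000 corresponds to a start address of
144 MB and an ENDPAG of 0x97FF to an end address of 152 MB.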
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) See also "CP Command and Utility Reference" (SC24-6081-00) for more information
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) on the DEF STOR and Q NSS MAP commands, as well as "Saved Segments Planning
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) and Administration" (SC24-6116-00) for more information on DCSSes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) 1st option:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) You can use the CP command DEF STOR CONFIG to define a "memory hole" in your
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) guest virtual storage around the address range of the DCSS.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) Example: DEF STOR CONFIG 0.140M 200M.200M
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) This defines two blocks of storage: the first is 140MB in size and begins at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) address 0MB, the second is 200MB in size and begins at address 200MB,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) resulting in a total storage of 340MB. Note that the first block should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) always start at 0 and be at least 64MB in size.
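
The arithmetic behind this example can be checked with a short sketch. The
struct and function names below are hypothetical and are not part of any
z/VM or Linux interface. They only model each block as a start address and a
size in MB, add up the sizes, and test that the DCSS range lies in the hole.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One storage block as written in DEF STOR CONFIG: <start>.<size>, in MB. */
struct stor_block {
	uint64_t start_mb;
	uint64_t size_mb;
};

/* Sum of all block sizes, i.e. the total guest storage in MB. */
static uint64_t total_storage_mb(const struct stor_block *b, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += b[i].size_mb;
	return total;
}

/* True if the range [dcss_begin_mb, dcss_end_mb) overlaps no block. */
static bool dcss_fits_in_hole(const struct stor_block *b, size_t n,
			      uint64_t dcss_begin_mb, uint64_t dcss_end_mb)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t blk_end = b[i].start_mb + b[i].size_mb;

		if (dcss_begin_mb < blk_end && b[i].start_mb < dcss_end_mb)
			return false;
	}
	return true;
}
```

For the blocks {0, 140} and {200, 200}, the total is 340MB and a DCSS
spanning 144MB to 152MB falls inside the hole between 140MB and 200MB.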
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) 2nd option:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) Your guest virtual storage has to end below the starting address of the DCSS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) and you have to specify the "mem=" kernel parameter in your parmfile with a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) value greater than the ending address of the DCSS.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) Example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) DEF STOR 140M
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) This defines 140MB storage size for your guest. The parameter "mem=160M" must
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) then be added to the parmfile.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) User Interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) The char device is implemented as a kernel module named "monreader",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) which can be loaded via the modprobe command, or it can be compiled into the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) kernel instead. There is one optional module (or kernel) parameter, "mondcss",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) to specify the name of the monitor DCSS. If the module is compiled into the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) kernel, the kernel parameter "monreader.mondcss=<DCSS NAME>" can be specified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) in the parmfile.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) The default name for the DCSS is "MONDCSS" if none is specified. If other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) users are already connected to the `*MONITOR` service (e.g.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) Performance Toolkit), the monitor DCSS is already defined and you have to use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) the same DCSS. The CP command Q MONITOR (Class E privileged) shows the name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) of the monitor DCSS, if already defined, and the users connected to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) `*MONITOR` service.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) Refer to the "z/VM Performance" book (SC24-6109-00) on how to create a monitor
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) DCSS if your z/VM doesn't have one already. You need Class E privileges to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) define and save a DCSS.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) Example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) modprobe monreader mondcss=MYDCSS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) This loads the module and sets the DCSS name to "MYDCSS".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) NOTE:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) -----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) This API provides no interface to control the `*MONITOR` service, e.g. specify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) which data should be collected. This can be done by the CP command MONITOR
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) (Class E privileged), see "CP Command and Utility Reference".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) Device nodes with udev:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) -----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) After loading the module, a char device will be created along with the device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) node /<udev directory>/monreader.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) Device nodes without udev:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) If your distribution does not support udev, a device node will not be created
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) automatically and you have to create it manually after loading the module.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) To do so, you need to know the major and minor numbers of the device. These
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) numbers can be found in /sys/class/misc/monreader/dev.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) Typing cat /sys/class/misc/monreader/dev will give an output of the form
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) <major>:<minor>. The device node can be created via the mknod command: enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) mknod <name> c <major> <minor>, where <name> is the name of the device node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) to be created.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) Example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) # modprobe monreader
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) # cat /sys/class/misc/monreader/dev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) 10:63
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) # mknod /dev/monreader c 10 63
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) This loads the module with the default monitor DCSS (MONDCSS) and creates a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) device node.
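
A program that creates the node itself needs the same two numbers. The
following is a minimal sketch of parsing the <major>:<minor> text. The
function name is hypothetical, and the text is assumed to have been read
from /sys/class/misc/monreader/dev beforehand.

```c
#include <stdio.h>

/*
 * Parse a "<major>:<minor>" string as found in a sysfs "dev" attribute.
 * Returns 0 on success, -1 if the text does not have that form.
 */
static int parse_dev_numbers(const char *text, unsigned int *major_no,
			     unsigned int *minor_no)
{
	if (sscanf(text, "%u:%u", major_no, minor_no) != 2)
		return -1;
	return 0;
}
```

The two values can then be passed to the mknod command as shown above, or
to the mknod(2) system call.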
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) File operations:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) The following file operations are supported: open, release, read, poll.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) There are two alternative methods for reading: either non-blocking read in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) conjunction with polling, or blocking read without polling. IOCTLs are not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) supported.
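
The non-blocking method can be sketched as follows, assuming the device was
opened with O_NONBLOCK. The function name and the timeout handling are
illustrative choices, not part of the monreader interface. The sketch only
uses the standard poll(2) and read(2) calls.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable, then read once.
 * Returns the read() result, or -1 with errno set. If the wait times
 * out, errno is set to EAGAIN.
 */
static ssize_t poll_then_read(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ready;

	/* Restart the wait if it is interrupted by a signal. */
	do {
		ready = poll(&pfd, 1, timeout_ms);
	} while (ready < 0 && errno == EINTR);

	if (ready < 0)
		return -1;
	if (ready == 0) {
		errno = EAGAIN;
		return -1;
	}
	return read(fd, buf, len);
}
```

For the blocking method, a plain read() on a descriptor opened without
O_NONBLOCK is sufficient and no call to poll() is needed.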
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) Read:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) -----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) Reading from the device provides a 12-byte monitor control element (MCE),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) followed by a set of one or more contiguous monitor records (similar to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) output of the CMS utility MONWRITE without the 4K control blocks). The MCE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) contains information on the type of the following record set (sample/event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) data), the monitor domains contained within it and the start and end address
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) of the record set in the monitor DCSS. The start and end address can be used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) to determine the size of the record set; the end address is the address of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) last byte of data. The start address is needed to handle "end-of-frame" records
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) correctly (domain 1, record 13), i.e. it can be used to determine the record
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) start offset relative to a 4K page (frame) boundary.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) See "Appendix A: `*MONITOR`" in the "z/VM Performance" document for a description
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) of the monitor control element layout. The layout of the monitor records can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) be found here (z/VM 5.1): https://www.vm.ibm.com/pubs/mon510/index.html
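
The size and page offset calculations described above can be sketched as
follows. The field positions are an assumption made for illustration: the
start address is taken to be a 32-bit big-endian value at byte offset 4 of
the MCE and the end address one at byte offset 8. Verify this against the
layout in the "z/VM Performance" document before relying on it. All function
names are hypothetical.

```c
#include <stdint.h>

#define MCE_SIZE 12

/* Decode a 32-bit big-endian value from raw bytes. */
static uint32_t be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* ASSUMED layout: start address at offset 4, end address at offset 8. */
static uint32_t mce_start(const unsigned char *mce)
{
	return be32(mce + 4);
}

static uint32_t mce_end(const unsigned char *mce)
{
	return be32(mce + 8);
}

/* The end address is the last byte of data, so size = end - start + 1. */
static uint32_t record_set_size(const unsigned char *mce)
{
	return mce_end(mce) - mce_start(mce) + 1;
}

/* Offset of the record set start relative to a 4K page (frame) boundary. */
static uint32_t record_set_page_offset(const unsigned char *mce)
{
	return mce_start(mce) & 0xFFFu;
}
```

The explicit byte decoding keeps the sketch independent of the byte order of
the machine it is compiled on.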
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) The layout of the data stream provided by the monreader device is as follows::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) <0 byte read>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) <first MCE> \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) <first set of records> |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) ... |- data set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) <last MCE> |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) <last set of records> /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) <0 byte read>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) There may be more than one combination of MCE and corresponding record set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) within one data set and the end of each data set is indicated by a successful
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) read with a return value of 0 (0 byte read).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) Any received data must be considered invalid until a complete set has been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) read successfully, including the closing 0 byte read. Therefore you should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) always read the complete set into a buffer before processing the data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) The maximum size of a data set can be as large as the size of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) monitor DCSS, so size the buffer accordingly or use dynamic memory allocation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) The size of the monitor DCSS will be printed into syslog after loading the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) module. You can also use the (Class E privileged) CP command Q NSS MAP to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) list all available segments and information about them.
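
Reading a complete data set into a dynamically growing buffer might look
like the following sketch. The function name, the initial buffer size of
64 KB and the doubling strategy are arbitrary illustrative choices. The
sketch treats every read error as a reason to discard the partial set, which
is the conservative choice. An application may decide differently for
EOVERFLOW.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read one complete data set from fd into a malloc'ed buffer, stopping at
 * the closing 0-byte read. On success, returns 0 and stores the buffer
 * (to be freed by the caller) and its length. On failure, returns -1 with
 * errno set, and everything read so far is discarded.
 */
static int read_data_set(int fd, unsigned char **out, size_t *out_len)
{
	size_t cap = 64 * 1024, len = 0;
	unsigned char *buf = malloc(cap);

	if (!buf)
		return -1;
	for (;;) {
		ssize_t n;

		if (len == cap) {	/* grow before the next read */
			unsigned char *tmp = realloc(buf, cap * 2);

			if (!tmp) {
				free(buf);
				errno = ENOMEM;
				return -1;
			}
			buf = tmp;
			cap *= 2;
		}
		n = read(fd, buf + len, cap - len);
		if (n == 0)		/* end of the data set */
			break;
		if (n < 0) {
			if (errno == EINTR)
				continue;
			int saved = errno;

			free(buf);	/* the partial set is invalid */
			errno = saved;
			return -1;
		}
		len += (size_t)n;
	}
	*out = buf;
	*out_len = len;
	return 0;
}
```

Only after this function returns 0 should the caller start walking the MCEs
and record sets in the buffer.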
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) As with most char devices, error conditions are indicated by returning a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) negative value for the number of bytes read. In this case, the errno variable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) indicates the error condition:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) EIO:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) reply failed, read data is invalid and the application
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) should discard the data read since the last successful read with 0 size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) EFAULT:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) copy_to_user failed, read data is invalid and the application should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) discard the data read since the last successful read with 0 size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) EAGAIN:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) occurs on a non-blocking read if there is no data available at the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) moment. There is no data missing or corrupted, just try again or rather
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) use polling for non-blocking reads.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) EOVERFLOW:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) message limit reached, the data read since the last successful
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) read with 0 size is valid but subsequent records may be missing.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) In the last case (EOVERFLOW) there may be missing data; in the first two cases
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) (EIO, EFAULT) there will be missing data. It is up to the application whether to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) continue reading subsequent data or to exit.
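
One way to organize this in an application is to map each errno value to
the action it calls for. The enum and function below are hypothetical names
for such a mapping. They only restate the four cases listed above and treat
any other value as unexpected.

```c
#include <errno.h>

enum read_action {
	READ_RETRY,		/* nothing lost, try again or poll first */
	READ_KEEP_MAY_MISS,	/* data so far valid, later records may be missing */
	READ_DISCARD,		/* data since the last 0-byte read is invalid */
	READ_UNEXPECTED		/* not one of the documented conditions */
};

/* Map the errno value of a failed read to the action described above. */
static enum read_action classify_read_error(int err)
{
	switch (err) {
	case EAGAIN:
		return READ_RETRY;
	case EOVERFLOW:
		return READ_KEEP_MAY_MISS;
	case EIO:
	case EFAULT:
		return READ_DISCARD;
	default:
		return READ_UNEXPECTED;
	}
}
```

A caller would typically pass errno right after read() returned a negative
value, before any other library call can overwrite it.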
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) Open:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) -----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) Only one user is allowed to open the char device. If it is already in use, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) open function will fail (return a negative value) and set errno to EBUSY.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) The open function may also fail if an IUCV connection to the `*MONITOR` service
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) cannot be established. In this case errno will be set to EIO and an error
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) message with an IPUSER SEVER code will be printed into syslog. The IPUSER SEVER
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) codes are described in the "z/VM Performance" book, Appendix A.
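
A small wrapper makes these failure cases easy to tell apart. This is a
sketch only: the function name and the convention of returning a negative
errno value are illustrative choices, and the path of the device node
depends on your system.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Open the device node read-only. Returns a file descriptor, or a negative
 * errno value: -EBUSY if the device is already in use, -EIO if the IUCV
 * connection to *MONITOR could not be established (check syslog for the
 * IPUSER SEVER code).
 */
static int open_monreader(const char *path, int nonblocking)
{
	int flags = O_RDONLY | (nonblocking ? O_NONBLOCK : 0);
	int fd = open(path, flags);

	return fd < 0 ? -errno : fd;
}
```

A call such as open_monreader("/dev/monreader", 1) would open the node from
the mknod example above for non-blocking reads.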
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) NOTE:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) -----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) As soon as the device is opened, incoming messages will be accepted and they
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) will count against the message limit, i.e. opening the device without reading
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) from it will eventually provoke the "message limit reached" error (EOVERFLOW
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) error code).