<?xml version='1.0' encoding='utf-8'?>

<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>

<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<?rfc toc="yes"?>
<?rfc sortrefs="yes"?>
<?rfc symrefs="yes"?>
<?rfc comments="yes"?>

<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-janzking-nmrg-telemetry-instrumentation-00" category="info" obsoletes="" updates="" submissionType="IRTF" xml:lang="en" tocInclude="true" sortRefs="true" symRefs="true" version="3">

  <front>
    <title abbrev="Tele Methods Analog Measurement">Telemetry Methodologies for Analog Measurement Instrumentation</title>
    <seriesInfo name="Internet-Draft" value="draft-janzking-nmrg-telemetry-instrumentation-00"/>

    <author initials="C." surname="Janz" fullname="Chris Janz">
      <organization>Huawei Canada</organization>
      <address>
        <email>christopher.janz@huawei.com</email>
      </address>
    </author>

    <author initials="D." surname="King" fullname="Daniel King">
      <organization>Lancaster University</organization>
      <address>
        <email>d.king@lancaster.ac.uk</email>
      </address>
    </author>

    <date year="2024"/>

    <workgroup>Network Management Research Group</workgroup>
    <keyword>network management</keyword> <keyword>telemetry</keyword> <keyword>instrumentation</keyword>

    <abstract>

      <t>Evolution toward network operations automation requires systems encompassing software-based 
         analytics and decision-making. Network-based instrumentation provides crucial data for these
         components and processes. However, the proliferation of such instrumentation, and the need to
         migrate the data it generates from the physical network to "off-the-network" software, pose
         challenges. In particular, analog measurement instrumentation, which generates time-continuous
         real number data, may generate significant data volumes.</t>

      <t>Methodologies for handling analog measurement instrumentation data will need to be identified 
         and discussed, informed in part by consideration of requirements for the operation of network
         digital twins, which may be important software-realm consumers of such data.</t>

    </abstract>

  </front>

  <middle>

    <section anchor="intro" numbered="true" toc="default">
      <name>Introduction</name>

      <t>Existing studies of network telemetry typically deal with packet-oriented measurements that 
         generate packet traffic, path, discard, latency and other data <xref target="RFC7799" format="default"/>, <xref target="OPSAWG-IFIT-FRAMEWORK" format="default"/>. 
         However, some networking equipment and network operations scenarios feature or use more physically-oriented 
         measurement instrumentation that generates data of a different character. Here, the particularities of 
         data generated by such "analog" instrumentation are examined, and telemetry methodologies suitable for 
         such data are considered. This consideration is informed by the requirements of specific use cases, 
         including network digital twins.</t>

      <t>Optical networks, which are increasingly rich in analog instrumentation, are used as a specific example here. 
         But the telemetry methodologies discussed may apply to instrumentation and telemetry intersecting a wide 
         variety of networks and their related operational software, for example, in support of digital twins that 
         provide modeling of radio-based transmission, thermal characteristics or energy consumption.</t>
  
      <t>This document presents telemetry methodologies tailored for analog measurement instruments, 
         aiming to enhance data accuracy, transmission efficiency, and real-time monitoring capabilities 
         for network digital twins. The findings underscore the potential of these methodologies to inform 
         best practices for telemetry in network digital twins that require analog measurement instruments. 
         The document also provides a state-of-the-art summary, including gaps and possible areas for further research.</t>

      </section>

      <section anchor="term" numbered="true" toc="default">
        <name>Terminology</name>

        <dl newline="false" spacing="normal">


          <dt>Resource:</dt>
          <dd>Any feature, including connectivity, buffers, compute,
              storage, and content delivery that forms part of or can be accessed
              through a network.  Resources may be shared between users, applications,
              and clients, or they may be dedicated for use by a unique customer.</dd>

          <dt>Infrastructure Resources:</dt>
          <dd>The hardware and software for
              hosting and connecting service functions (SFs).  These resources may include computing
              hardware, storage capacity, network resources (e.g., links and
              switching/routing devices enabling network connectivity), and
              physical assets for radio access.</dd>

        </dl>

      </section>

    <section anchor="bkgnd" numbered="true" toc="default">
      <name>Background</name>

      <t>Photonic networks, which transmit data through light signals via fiber optic cables, 
         are fundamental to telecommunications, internet services, data center operations, and 
         many other critical aspects of modern digital infrastructure. A range of measurement 
         instruments are routinely used in the deployment and maintenance of these networks.
         Key examples include:</t>

        <ul spacing="normal">

          <li>Optical Time Domain Reflectometers (OTDRs): These devices are used to test the 
              integrity of fiber optic cables by sending a series of light pulses into the 
              fiber and measuring the light that is scattered or reflected back. OTDRs can 
              detect and locate faults, splices, and bends in fiber optic cables, and are 
              crucial for both installation and troubleshooting;</li>

          <li>Optical Spectrum Analyzers (OSAs): OSAs measure the power spectrum of optical 
              devices to analyze the wavelength or frequency distribution of light. They are 
              vital for characterizing the performance of components like lasers and optical 
              amplifiers within the network;</li>              

          <li>Optical Power Meters and Light Sources: Used in tandem, these instruments measure
              the loss or attenuation in optical fibers and verify the power levels to ensure 
              that signals are transmitted with sufficient strength without exceeding the damage
              threshold of the network components;</li>     

          <li>Network Analyzers and Bit Error Rate Testers (BERTs): These tools assess the overall
              performance of the optical network by analyzing parameters such as signal integrity, 
              bit error rates, and network latency. They help in ensuring that the network can 
              reliably handle the intended data loads;</li>  

          <li>Wavelength Division Multiplexing (WDM) Analyzers: WDM technology combines multiple 
              optical carrier signals on a single optical fiber by using different wavelengths. 
              WDM analyzers are specialized tools for testing and maintaining these systems, 
              ensuring that each channel is transmitted efficiently without interference;</li>  
         
          <li>Dispersion Analyzers: These are used to measure chromatic and polarization mode
              dispersion in fiber optic cables, which can affect the quality and speed of data 
              transmission. Managing dispersion is crucial for long-distance and high-data-rate 
              optical communications.</li>  
        </ul>
         
      <t>These instruments play a critical role in the characterization, deployment, optimization, 
         and troubleshooting of optical networks. But their use tends to be restricted to specific
         operational phases, requires manual operation, and is generally not compatible with 
         application to operating facilities. The term "instrumentation" refers more properly to 
         "embedded" capability that is both operable on active infrastructure and capable of 
         continuous measurement operation. Such instrumentation is a necessary foundation for 
         telemetry.</t>
      </section>


    <section anchor="meas" numbered="true" toc="default">
        <name>Optical Network Measurement Instrumentation</name>

        <t>Optical network instrumentation has typically focused on detecting transmission performance degradation, 
           through measurement of error correction rates in FEC engines, counting of errored OTN frames, etc. 
           Such measurements are typically executed on network elements through time-interval-based counting. 
           The resulting counts may be forwarded to or collected by software on a subscription or polling basis. 
           The data consists of series of integer numbers, or series of time stamp-integer number couplets.</t>
           
        <t>In recent years, however, the nature and scope of optical network instrumentation has broadened and deepened <xref target="JIANG" format="default"/>. 
           The idea has been to instrument the optical network more richly to support more effective operations management, 
           including using software-based analytics and modeling. Implicated network operations include network and connection 
           planning and configuration, network and connection fault management (fault and impairment detection, classification, 
           localization, preemption, correction), and others.</t> 
         
        <t>The optical network is a high-performance analog transmission network, so, unsurprisingly, much of this new instrumentation
           is analog; that is, it produces time-continuous real-number data or data sets. Examples include optical loss, optical power 
           (total, channel peak, etc.), optical spectra (narrow-band-filtered power measured at a series of center wavelengths), 
           differential group delay (DGD), polarization mode dispersion (PMD), polarization dependent loss (PDL), Stokes vector 
           components reflecting state of polarization (SOP), linear optical signal-to-noise ratio (OSNR) and generalized optical 
           signal-to-noise ratio (GSNR). Many of these measurements are synthesized by coherent receivers across the network, 
           while some may be synthesized by in-span elements such as amplifiers and ROADMs.</t>                  
     </section>

    <section anchor="uc" numbered="true" toc="default">
        <name>Telemetry Use Cases</name>

        <t>One application of this data in the software realm is with optical network digital twins (NDTs), used for transmission 
           performance modeling <xref target="JANZ" format="default"/>, <xref target="NMRG-PODTS" format="default"/>. Such NDTs constitute an important class of analytical engine supporting optical
           network and service planning and other operations, and they rely heavily on data from network instrumentation to enable
           accurate modeling of optical transmission performance on targeted variations of the actual network and service configuration,
           state and condition. A default expectation would be that all instrumentation measurements are reflected continuously in the 
           software realm for use by optical NDTs. However, at best only an approximation of this can be achieved (e.g., only a series 
           of sampled measurements may in fact be streamed from the network), so the imperative is to find efficient ways to support 
           sufficiently accurate approximations. This imperative grows more compelling as the scale of the network and the richness 
           of embedded instrumentation increase.</t>
           
        <t>A second example application lies in the fault management domain, wherein analysis of rich data, concentrated around the time
           of a detected evolution in transmission conditions, may be used to classify and localize the origin of the observed evolution <xref target="HAHN" format="default"/>. 
           Transient evolutions of transmission performance are commonplace on optical networks and have myriad causes, including extrinsic 
           causes such as lightning strikes, earthworks and construction, weather, road and rail traffic, fires, etc., as well as intrinsic 
           causes including continuous or discrete deteriorations to equipment or fibre plant. Detection, classification, and localization of 
           transmission performance evolutions permit assessment of the likelihood, expected severity, and rate of further deterioration, and 
           planning of timely and cost-effective corrective interventions where indicated. However, successful analysis may depend on the 
           availability of richer data sets in software than may be supported by continuous streaming or required by other applications.</t>
           
     </section>

    <section anchor="req" numbered="true" toc="default">
        <name>Analog Measurement Requirements</name>

        <t><xref target="RFC9232" format="default"/> provides a framework for considering concepts, constructs and developments in network telemetry. Many of the 
           methods and mechanisms it discusses or suggests are invoked here.</t>

      <section anchor="sampling">
        <name>Sampling</name>
        <t>An analog-to-digital conversion process typically converts analog signals into digital data that can be transmitted, stored, and processed more efficiently. 
         This often involves sampling the signal at a certain rate and quantizing the amplitude into digital values. The "mirroring" (transmission for replication at a 
         different place) of continuous-time real number data, generated by in-network instrumentation, begins with sampling and representing measured values by a scalar 
         or vector of finite-decimal-place numbers. Since neither sampling at fixed intervals, nor fixed time alignment or offset among measurement points in the network or 
         between such points and the off-network software realm, can generally be assumed, it is useful that instrumentation should generate, as primary data, a series 
         of couplets or vectors consisting of sample time stamps and corresponding measured data values.</t>
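
        <t>As a minimal illustration of this primary data form, the following sketch pairs every sample with its own time stamp. The names are illustrative only: read_value is a hypothetical callable standing in for an instrumentation read-out, and real sample clocks and quantization behavior are device-specific.</t>

```python
import time

def sample_couplets(read_value, n_samples, interval_s):
    """Collect (time stamp, value) couplets from an analog measurement.

    read_value is a hypothetical stand-in for an instrumentation
    read-out; interval_s is a nominal, not guaranteed, sample period.
    """
    couplets = []
    for _ in range(n_samples):
        # Each sample carries its own time stamp, since neither fixed
        # sampling intervals nor time alignment with other measurement
        # points can be assumed.
        couplets.append((time.time(), read_value()))
        time.sleep(interval_s)
    return couplets
```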

       </section>

      <section anchor="precision">
        <name>Time Precision</name>
         <t>Inadequate sampling frequency and quantization error are both potential sources of error in the - literal or effective - "reconstruction" of the original 
           time-continuous measurement in the software realm. It is possible that sampling frequencies might be varied in response to evolving temporal characteristics of
           measured parameters; this is one strategy for data reduction (and one reason why sampling may not occur at fixed-period intervals).</t>
          
         <t>Requirements on the precision of reconstructed data, its time basis, and the alignment in time of different reconstructed measurements are determined by the 
            operational role played by the analytical functions that consume the data. Some operations of interest, such as network and service planning or fault and 
            impairment management, may impose only relatively relaxed requirements on time synchronization among measurement instruments, and between those instruments
            and the software realm.  Other applications, e.g., those concerning operations tending toward closed-loop control, may require tighter temporal data alignment
            among different measurement sources. These considerations have implications in terms of the source and synchronization of the clocks producing time stamps; but in
            general, requirements on clock synchronization and precision are far from those required for bit-level operations: i.e., they are generally more like 
            "network time" than "digital time".</t>

         <t>Similarly, requirements on the absolute or relative (i.e. among different measurement instruments) precision of reconstructed measured data values may be
            application-dependent. In many cases, relative precision, or precision consistency, may be more important than absolute precision.</t>

      </section>
 
       <section anchor="reduction">
        <name>Reduction and Other Pre-Processing</name>
         <t>With telemetric data volume a primary potential challenge, methods for reducing data volume associated with analog measurement instrumentation are of evident 
            interest. Signals may also be filtered to remove noise and unwanted frequencies to improve the data quality.</t>
          
      </section>

      <section anchor="compression">
        <name>Compression</name>
         <t>Data compression is an obvious candidate methodology for bandwidth reduction. Methods for lossless compression of series of numerical data have been widely studied, 
            e.g. <xref target="RATANAWORABHAN" format="default"/>.</t>
          
         <t>Obviously, such compression must be implemented as a "pre-processing" function executed by the telemetric instrumentation itself, or some proxy to it. Similarly, 
            decompression must be implemented as a "post-processing" function within the software realm. Where time stamps are uncompressed, depending on the compression 
            methodology employed, it may be possible to support selective decompression of data, e.g., only on selected time intervals. This might allow for application-driven 
            "as-required" post-processing (decompression) of more limited volumes of telemetric data.</t> 

         <t>The compressibility of time-based data depends on its evolution in data-entropic terms, resulting in streamed data flows of varying volume or rate. The effective 
            transmission and reception rates of data samples thus may vary and differ at any point from the rate of data generation. This is another reason why data samples 
            may require time stamps.</t>
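
         <t>A minimal sketch of such lossless compression, assuming couplets of IEEE-754 doubles and generic deflate compression; specialized floating-point compressors, such as the predictor-based schemes studied in the literature, typically achieve better ratios. Function names are illustrative only.</t>

```python
import struct
import zlib

def compress_samples(samples):
    """Losslessly compress a series of (time stamp, value) couplets.

    Couplets are packed as pairs of IEEE-754 doubles and
    deflate-compressed; the round trip is exact.
    """
    raw = b"".join(struct.pack("!dd", t, v) for t, v in samples)
    return zlib.compress(raw)

def decompress_samples(blob):
    """Invert compress_samples, recovering the exact couplets."""
    raw = zlib.decompress(blob)
    return [struct.unpack("!dd", raw[i:i + 16]) for i in range(0, len(raw), 16)]
```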
            
        <t>Other forms of effective data reduction through pre-processing may also be useful, or preferred:</t>
 
         <ul spacing="normal">

          <li>Thresholding: Data samples are transmitted only if and when a measured value, or a derivative of the measured value, crosses a threshold. Possible examples include: 
              a) exceeding some absolute or proportional variation from the last transmitted sample value; b) exceeding a previously observed and transmitted maximum or minimum value;
              or, c) exceeding some time rate-of-change of the measured value.</li>
              
          </ul>
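
          <t>Case (a) above can be sketched as follows; the first sample is always transmitted to establish a baseline, and the function and parameter names are illustrative only.</t>

```python
def threshold_filter(samples, abs_delta):
    """Forward a (time stamp, value) couplet only when the value
    differs from the last transmitted value by more than abs_delta,
    i.e. absolute-variation thresholding (case (a) above).
    """
    transmitted = []
    last = None
    for ts, value in samples:
        if last is None or abs(value - last) > abs_delta:
            transmitted.append((ts, value))
            last = value
    return transmitted
```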

          <t>Post-processing of threshold-driven data may or may not be required by applications. For example, an application may generate a scenario for behavioral analysis by an 
             NDT that requires the "current" data from network instrumentation. To whatever precision is effectively reflected in the details of the operating thresholding mechanisms, 
             that data is simply the most recently transmitted sample from network measurement instruments. Another application, however, perhaps one dealing with fault or impairment 
             management, might require a regular and continuous time series presentation of measured data. In that case, e.g. interpolation or other post-processing of received data 
             samples might be needed.</t>
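
          <t>A sketch of one such post-processing option, linear interpolation of irregularly spaced couplets onto a regular time grid; names are illustrative, and time stamps are assumed strictly increasing.</t>

```python
def regularize(samples, period):
    """Linearly interpolate (time stamp, value) couplets onto a
    regular time grid of the given period.

    samples must contain at least two couplets with strictly
    increasing time stamps.
    """
    out = []
    t = samples[0][0]
    i = 0
    while t <= samples[-1][0]:
        # Advance to the bracketing pair of samples around t.
        while samples[i + 1][0] < t:
            i += 1
        (t0, v0), (t1, v1) = samples[i], samples[i + 1]
        frac = (t - t0) / (t1 - t0)
        out.append((t, v0 + frac * (v1 - v0)))
        t += period
    return out
```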
             
          <t>Other kinds of pre-processing may also be of interest, including normalization of data, frequency domain conversion, and computation of statistics.</t>

         <ul spacing="normal">

          <li>Triggering: An extension or variation of thresholding, triggering may refer to, e.g. the transmission of a series of samples - from a defined set of 
              measurement instruments, over a defined period of time and at defined time intervals - on crossing of a particular threshold (i.e., that threshold 
              crossing "triggers" the transmission of the defined data series). Triggering of this kind may be useful in e.g. fault and impairment management. The detection 
              by instrumentation of some pre-defined circumstance or occurrence - e.g. observation of an unusually large or rapid change in an optical power level or channel SOP - 
              would trigger the transmission of a pre-defined, "rich" set of data covering a time interval around the triggering observation. That data could then be subjected to 
              various forms of "forensic" analysis in software to support detection, classification or localization of transmission performance-impacting events. Required 
              pre-processing includes processing of triggers, and the sliding storage of instrumentation data sample values sufficient to cover the targeted data capture time "window" 
              as well as trigger processing and transmission intervals.</li>  
          </ul>
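
          <t>The sliding-storage behavior described above can be sketched with a bounded pre-trigger window; the trigger predicate, window sizes and names are illustrative only, and a real implementation would work on live streams rather than a finite sample list.</t>

```python
from collections import deque

def triggered_capture(samples, trigger, pre, post):
    """On the first sample satisfying the trigger predicate, emit a
    batch covering up to `pre` samples before and `post` samples after
    the triggering observation.
    """
    window = deque(maxlen=pre)   # sliding pre-trigger storage
    it = iter(samples)
    for sample in it:
        if trigger(sample):
            batch = list(window) + [sample]
            for _ in range(post):
                try:
                    batch.append(next(it))
                except StopIteration:
                    break
            return batch
        window.append(sample)
    return []
```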

       </section>   
 
      <section anchor="streaming">
        <name>Programmable Streaming</name>
         <t>As discussed in <xref target="RFC9232" format="default"/>, in-network pre-processing of telemetry data may usefully be "programmed" by telemetry clients (i.e., software applications that are consumers of 
            instrumentation data), including dynamically or variably. The range and nature of software applications and their data requirements may vary among systems, may evolve with 
            time within any given system - based on experience and learning (automated or not) or with the deployment of new capabilities - and may also vary as a function of available 
            instrumentation capabilities on a given network, which themselves may evolve.</t>
          
       </section>          

      <section anchor="polling">
        <name>Streaming versus Polling</name>
         <t>Streaming - i.e., subscription-based push - is, as identified in <xref target="RFC9232" format="default"/> and other works, and as suggested by the discussion above, expected to be the principal, 
            if not exclusive, operational modality for telemetry, including analog instrumentation telemetry. Software clients consume data generated by the network, and having 
            identified which data they require and from where within the network, use subscriptions to place themselves in a position to receive it, on an ongoing basis, without 
            continuing operational steps.</t>
         
         <t>Triggered transmission of "batched" data is aligned with a streaming paradigm, as the telemetry server (i.e., instrumentation) must detect the trigger conditions and 
            react by capturing and transmitting data to subscribing clients.</t>
         
         <t>It is worth considering, however, whether polling can or should be completely dispensed with, or whether it might retain some utility in some cases or circumstances.</t>
         
         <t>The discussion so far supports a view that the data needs of NDTs can be satisfied, and in fact probably are best served by, streaming. However, polling could be used if 
            NDT-based analyses are required relatively infrequently, do not require very rapid execution, and do not draw arbitrarily on historical data. Polling might also be useful 
            as a complementary mechanism to streaming. For example, to reduce data transmission and handling volumes, an NDT might choose to unsubscribe from telemetry that it has 
            observed to change little over time. However, for particularly critical analyses, the NDT might want to ensure that all available telemetry data is up to date, by polling the 
            unsubscribed instrumentation. Further, if certain kinds of data compression are used, decompression processes can enter into errored regimes e.g. through transmission 
            loss of telemetry data. Periodic polling may be useful to "re-set" absolute data values in such cases. In fact, as suggested in <xref target="RFC7799" format="default"/>, the possibility of transmission 
            loss of streamed telemetry packets, a concern particularly if unreliable transport paradigms such as UDP are used, may provide a general reason to enable polling as a 
            "failsafe" mechanism. </t>
                   
       </section>   

      <section anchor="protocols">
        <name>Communication Protocols</name>
         <t>Communication protocols facilitate the reliable data exchange between telemetry devices and control systems. Depending on the method, streaming and/or polling, 
            various messaging protocols exist to provide efficient delivery of instrumentation data.</t>
       </section>   

      <section anchor="models">
        <name>Data Models</name>
         <t>A complete framework for analog instrumentation telemetry might require data models supporting:</t>
            
        <ul spacing="normal">

          <li>Identification of instrumentation-equipped and telemetry-capable network equipment, the latter's available instrumentation, 
              its available pre-processing, and what aspects of available pre-processing are programmable;</li>

          <li>Subscription to streaming from specific instrumentation;</li>              

          <li>Programming (or re-programming) of pre-processing on specific subscriptions and instrumentation, including type of pre-processing, 
              applicable thresholds or triggers, and definition of trigger-associated data sets (included data and start/stop interval limits 
              vs. triggering events);</li>     

          <li>Transmission of applicable time stamp-data value couplets, vectors or batches.</li>  

        </ul>            
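
         <t>Purely for illustration, a subscription and pre-processing programming message covering these elements might carry fields along the following lines; all names are hypothetical and drawn from no existing data model.</t>

```python
# Hypothetical subscription message illustrating the model elements
# listed above; every field name here is illustrative, not taken from
# any standardized data model.
subscription = {
    "instrument": "osa-1/port-3",          # specific instrumentation
    "pre_processing": {
        "type": "triggering",
        "trigger": {"parameter": "channel-power", "delta-db": 3.0},
        "capture": {                       # trigger-associated data set
            "pre-seconds": 10,
            "post-seconds": 30,
            "interval-ms": 100,
        },
    },
    "transport": "stream",                 # streamed couplets/batches
}
```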

       </section>   
      
    </section>
       
    <section anchor="IANA" numbered="true" toc="default">
      <name>IANA Considerations</name>

        <t>This document makes no requests for action by IANA.</t>

    </section>

    <section anchor="ops" numbered="true" toc="default">
      <name>Operational Considerations</name>

      <t>Operational considerations for Optical Network Measurement Instrumentation involve a range of factors to ensure accurate, 
         reliable, and efficient performance of the optical networks. These considerations are critical for deploying, maintaining, 
         and troubleshooting fiber optic systems. Key operational considerations include:</t>  

        <ul spacing="normal">

          <li>Calibration and Signal Integrity</li>

          <li>Dynamic Range and Sensitivity</li>              

          <li>Resolution and Accuracy</li>     

          <li>Scalability</li>  

          <li>Bandwidth and storage of instrumentation data</li>  

        </ul>            

      <t>Future versions of this document will expand on the topics above and broaden the scope of operational considerations.</t>

    </section>

    <section anchor="sec" numbered="true" toc="default">
      <name>Security Considerations</name>

      <t>The security implications of optical network telemetry are critical, given the increasing reliance on optical networks for data transmission
         in various sectors. Ensuring the security and integrity of these networks and the telemetry instrumentation used to measure and maintain them is 
         paramount to prevent unauthorized access, data breaches, potential service disruptions, and use as possible threat vectors and attack surfaces.</t>
         
      <t>Key security considerations include:</t>  

        <ul spacing="normal">

          <li>Encryption of sensitive telemetry data</li>

          <li>Secure configuration and management of telemetry functions</li>              

          <li>Network monitoring and anomaly detection</li>     

          <li>Secure data handling and storage</li>  

        </ul>    
        
       <t>Future versions of this document will expand on the topics above and broaden the scope of security considerations.</t>

    </section>

    <section anchor="ACK" numbered="true" toc="default">
      <name>Acknowledgements</name>

      <t>Thanks to the Network Digital Twin discussions in the Network Management Research Group, which provided further input into this work.</t>

      <t>This work is supported by the UK Department for Science, Innovation and Technology under the Future Open Networks Research Challenge project TUDOR 
         (Towards Ubiquitous 3D Open Resilient Network). The views expressed are those of the authors and do not necessarily represent those of the project.</t>
    </section>


  </middle>

  <back>

  <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
      </references>
      <references>
        <name>Informative References</name>
        <?rfc include="reference.RFC.7799.xml"?>
        <?rfc include="reference.RFC.9232.xml"?>
        <reference anchor="OPSAWG-IFIT-FRAMEWORK" target="https://datatracker.ietf.org/doc/html/draft-song-opsawg-ifit-framework-21">
          <front>
            <title>Framework for In-Situ Flow Information Telemetry</title>
            <author>
              <organization>IETF</organization>
            </author>
            <date year="2023" month="October" day="1"/>
          </front>
        </reference>
        <reference anchor="NMRG-PODTS" target="https://datatracker.ietf.org/doc/draft-paillisse-nmrg-performance-digital-twin/02">
          <front>
            <title>Performance-Oriented Digital Twins for Packet and Optical Networks</title>
            <author>
              <organization>IETF</organization>
            </author>
            <date year="2023" month="October" day="1"/>
          </front>
        </reference>
        <reference anchor="JIANG" target="https://opg.optica.org/jlt/abstract.cfm?uri=jlt-40-10-3128">
          <front>
            <title>Progresses of Pilot Tone Based Optical Performance Monitoring in Coherent Systems</title>
            <author>
              <organization>Journal of Lightwave Technology, vol. 40, No. 10, pp. 3128-3136</organization>
            </author>
            <date year="2023" month="October" day="1"/>
          </front>
        </reference>
        <reference anchor="JANZ" target="https://ieeexplore.ieee.org/document/9789844">
          <front>
            <title>Digital Twin for the Optical Network: Key Technologies and Enabled Automation Applications</title>
            <author>
            <organization>IEEE/IFIP Network Operations and Management Symposium, Workshop on Technologies for Network Twins</organization>
            </author>
            <date year="2022" month="April" day="1"/>
          </front>
        </reference>
        <reference anchor="HAHN" target="">
          <front>
            <title>On the Spatial Resolution of Location-Resolved Performance Monitoring by Correlation Method</title>
            <author>
              <organization>Optical Fiber Communications</organization>
            </author>
            <date year="2023" month="March" day="1"/>
          </front>
        </reference>
        <reference anchor="RATANAWORABHAN" target="">
          <front>
            <title>Fast Lossless Compression of Scientific Floating-Point Data</title>
            <author>
              <organization>Data Compression Conference</organization>
            </author>
            <date year="2006" month="May" day="1"/>
          </front>
        </reference>
    </references>
    </references>

  </back>

</rfc>
