We recommend switching to the latest versions of Edge, Firefox, Chrome or Safari. Using Internet Explorer will result in a loss of website functionality.


1 comment

  • Avatar
    Adrian Williams

    There is one solution:

    The first action is to sort the input data by 'SALARY' ascending. Then, any records with Null 'SALARY' values are split out of the data. 

    The calculate fields node is used to create a 'Rec_ID' field (zero-based).

    A Transform node is used to identify the record with the closest salary.

    The ConfigureFields script defines the output metadata and initializes an array to store the input record values.

    In the ProcessRecords script, the fields from each input record are placed into an array and the array is appended to the salary_array (i.e. to form a list of lists):

    When all of the input records have been added to the list, the total number of records is determined:

    Then the logic loops through each record in the salary_array and the id of the current record is extracted from the record's array:

    If this is the first record in the array then it must have the lowest 'SALARY' value so the closest value must be in the next record. The values are explicitly written out:

    If we are processing a subsequent record, but it is not the last record in the array then we need to get the difference between the salary value in the current record and the previous record's salary value, and the difference between the salary value in the current record and the next record's salary value. Then these differences are compared to determine which is the closest. If the differences are the same the previous record is selected as the closest value.

    If the record counter was not an intermidiate value then we must be processing the last record so the closest must be the previous record's values:

    This example solution is included in the attached data flow.

    Note: In this solution the values for all input records are being stored in a Python array - which is stored in memory - so it may not be suitable for processing very large data sets.



    Attached files

    Find_Record_w_Closest_Value - 5 Aug 2021.lna


    Comment actions Permalink

Please sign in to leave a comment.