import numpy as np
import pandas as pd
from lcs.representations.RealValueEncoder import RealValueEncoder
from myst_nb import glue


def encode(p, bits):
    return int(RealValueEncoder(bits).encode(p))


def build_encoding_df():
    r = np.arange(0, 1.1, .1)
    df = pd.DataFrame(r, columns=['Perception'])

    for bit in [1, 2, 3, 4, 5, 6, 7]:
        df[f'{bit}-bit'] = df.apply(lambda row: encode(row['Perception'], bit), axis=1)

    df.set_index('Perception', inplace=True)
    return df


glue("hybrid_interval_df", build_encoding_df(), display=False)

3.1. Interval-based representation

There are several possible ways of representing intervals as described in Section “Interval predicates”. This chapter deliberately selects the UBR approach introduced by Stone and Bull for XCS [34], since it supersedes both CSR and OBR encodings and tries to apply it within the ALCS algorithm.

However, any agent cannot be adapted to UBR just by changing the communication layer with the environment. Therefore, the ACS2 algorithm was elected as a representative candidate of the ALCS family due to its maturity and the considerable number of tightly interacting components.

Moreover, to limit the flourishing population size, a hybrid approach was taken to represent intervals, where each boundary is represented using a nominal integer value within a certain range. It is assumed that the input attribute perception \(\sigma_i\) is defined as \(\sigma_i \in [0, 1]\). Fig. 3.1 demonstrates an example of encoding real value input using particular encoding resolutions. Obviously, low encoding values introduce ambiguity, and this parameter must be chosen carefully.

1-bit 2-bit 3-bit 4-bit 5-bit 6-bit 7-bit
Perception
0.0 0 0 0 0 0 0 0
0.1 0 0 0 1 3 6 12
0.2 0 0 1 3 6 12 25
0.3 0 1 2 4 9 19 38
0.4 0 1 3 6 12 25 51
0.5 1 2 4 8 16 32 64
0.6 1 2 4 9 19 38 76
0.7 1 2 5 11 22 44 89
0.8 1 3 6 12 25 51 102
0.9 1 3 7 14 28 57 115
1.0 1 3 7 15 31 63 127

Fig. 3.1 Examples of encoded values for different perceptions \(\sigma\) values. Maximum resolution is calculated with \(2^n\), where \(n\) is the number of bits used. E.g. a range \([0.3;0.6]\) encoded with 7 bits would be \([38; 76]\) for OBR/UBR or \([76,38]\) for UBR encoding (order is irrelevant).

The proprietary variation of the ACS2 system was named rACS [127] and can be distinguished from the native implementation by:

  • Don’t care symbol - In rACS the feature attributes consists solely of interval ranges. The “don’t care” and “pass-through” wildcard symbols are represented as a full-ranged interval (e.g. using 4 bit encoding - \(\textrm{UBR}(0, 15)\) or \(\textrm{UBR}(15, 0)\)).

  • Covering - The covering process introduces randomness when a new classifier is added to the population. A new parameter - covering noise \(\epsilon_{cover}\) defines the maximum noise that can alter current perception. The noise \(\epsilon\) is drawn from uniform random distribution \(U[0, \epsilon_{cover}]\). When creating a new classifier each condition and effect attribute is spread \(\textrm{UBR}(x_1 - \epsilon, x_2 + \epsilon)\) accordingly.

  • Mutation - Similarly, a new parameter - mutation noise \(\epsilon_{mutation}\) is used for introducing slight disturbances. For each attribute of condition and effect perception string a noise \(\epsilon\) is drawn from uniform distribution \(U[-\epsilon_{mutation}, \epsilon_{mutation}\)] and added to the current value.

  • Subsumption - The mechanism was extended accordingly to analyze incorporating ranges.

  • Marking - Classifier’s mark stores only single encoded exceptional perceptions (not intervals).

Experimental evaluation

This section presents the motivation, goals and setup of the performed experiments and their results.

Research questions

The conducted research aims to answer the following question regarding the rACS algorithm and the usage of interval-based representation

  1. Can the rACS algorithm form the internal model of the environment and exploit it successfully?

Goals of the experiments

Experiment 1 - Encoding precision

The impact of using different numbers of bits for creating UBR ranges will be contrasted with the ability to exploit the single-step Real Multiplexer environment. Due to the environment’s increasing complexity, a simple 3-bit variant is sufficient to demonstrate the potential pitfalls of the proposed interval-based representation

Experiment 2 - Nature of the intervals

The main goal of the experiment is to investigate the nature and the evolution of condition intervals. An experiment using the Checkerboard environment will highlight significant differences between conditional attributes, but also the overall performance in this environment will be analyzed.

Answers to research questions

The answers to the previously formulated research questions are as follows:

Q1: Can the rACS algorithm form the internal model of the environment and exploit it successfully?

The performed experiments involved two single-step problems with varying difficulties. The rACS implementation showed promising results, but revealed certain limitations were revealed in both of them.

Rule compaction

Due to the nature of LCS systems, most of the components interact sequentially, frequently iterating over the population of classifiers. rACS agents showed to be biased towards creating rules capturing specific small niches of the solution space, therefore significantly increasing population size and making it more indecipherable. The enhancement merging rules affecting neighbouring areas of solution space would have a heavy impact on the solution’s computational performance and compactness.

Encoding resolution

The proposed approach represented interval boundaries as integer values. While this decision has a positive impact mainly on the simplification of internal operators, there are certain drawbacks. First and foremost, there is a tradeoff between selected resolution and the size of the population. In all performed experiments, situations with high resolution yield the most significant internal models containing very specific classifiers. Second, knowledge about environment internal regularities (like the rRMPX threshold) is beneficial for determining optimal encoding value because some of the regularities might not be captured due to perceptional ambiguity (see Fig. 3.1).

Effect part

The majority of rACS modifications focus on building the correct condition part, ignoring the evolution of the effect side. The ALP component responsible for driving the agent’s learning mechanism was not intended for real-valued representation. Moreover, the problem with the growing population is also related to the creation of duplicated and overlapping classifiers where the aforementioned processes cannot perform subsumption and merging of classifiers with similar rule structure while favouring more general classifiers. Therefore, in rACS, its usage was simplified, indicating a need for a dedicated solution.