The author
Henri Laude (henri.laude@ar-p.com), Advanced Research Partners
The reference program
You are free to choose the language (Python or R) and, among those proposed (TensorFlow 2.x, PyTorch, ...), the algorithms and packages you want to call. However, you must use the same input data (the “cora” file), perform a random 50/50 split on the data, and the last line produced by your program must be formatted as follows:
Accuracy = 79.0%, Time = 30.1 s
Accuracy is the prediction success rate on the test graph, and Time is the elapsed time between the start of the model-creation code and the computation of the accuracy on the test data.
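As a minimal sketch of the expected skeleton (the variable names here are ours, not imposed by the challenge):

import time
start_time = time.time()  # start the timer when model creation begins
# ... build, train and evaluate the model here ...
test_accuracy = 0.790  # placeholder: accuracy measured on the test graph
interval = time.time() - start_time
# the last line printed must follow this exact format:
print(f"Accuracy = {test_accuracy*100:.1f}%, Time = {interval:.1f} s")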
# Graph attention network (GAT) for node classification (adapted from a keras.io tutorial)
# Author: akensert
# Date created: 2021/09/13
# Last modified: 2021/12/26
# Description: An implementation of a Graph Attention Network (GAT)
# for node classification.
# Minor changes for our challenge: H. Laude
# Date: 2022/10/05
# Import packages
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import pandas as pd
import os
import warnings
import time
warnings.filterwarnings("ignore")
pd.set_option("display.max_columns", 6)
pd.set_option("display.max_rows", 6)
np.random.seed(2)
# Load the data
zip_file = keras.utils.get_file(
fname="cora.tgz",
origin="https://linqs-data.soe.ucsc.edu/public/lbc/cora.tgz",
extract=True,
)
data_dir = os.path.join(os.path.dirname(zip_file), "cora")
citations = pd.read_csv(
os.path.join(data_dir, "cora.cites"),
sep="\t",
header=None,
names=["target", "source"],
)
papers = pd.read_csv(
os.path.join(data_dir, "cora.content"),
sep="\t",
header=None,
names=["paper_id"] + [f"term_{idx}" for idx in range(1433)] + ["subject"],
)
class_values = sorted(papers["subject"].unique())
class_idx = {name: id for id, name in enumerate(class_values)}
paper_idx = {name: idx for idx, name in enumerate(sorted(
papers["paper_id"].unique()))}
papers["paper_id"] = papers["paper_id"].apply(lambda name: paper_idx[name])
citations["source"] = citations["source"].apply(lambda name: paper_idx[name])
citations["target"] = citations["target"].apply(lambda name: paper_idx[name])
papers["subject"] = papers["subject"].apply(lambda value: class_idx[value])
Excerpt from “citations”:
print(citations)
target source
0 0 21
1 0 905
2 0 906
... ... ...
5426 1874 2586
5427 1876 1874
5428 1897 2707
[5429 rows x 2 columns]
Excerpt from “papers”:
print(papers)
paper_id term_0 term_1 ... term_1431 term_1432 subject
0 462 0 0 ... 0 0 2
1 1911 0 0 ... 0 0 5
2 2002 0 0 ... 0 0 4
... ... ... ... ... ... ... ...
2705 2372 0 0 ... 0 0 1
2706 955 0 0 ... 0 0 0
2707 376 0 0 ... 0 0 2
[2708 rows x 1435 columns]
Data preparation
# split
# Obtain random indices
random_indices = np.random.permutation(range(papers.shape[0]))
# 50/50 split
train_data = papers.iloc[random_indices[: len(random_indices) // 2]]
test_data = papers.iloc[random_indices[len(random_indices) // 2 :]]
# graph construction
# Obtain paper indices which will be used to gather node states
# from the graph later on when training the model
train_indices = train_data["paper_id"].to_numpy()
test_indices = test_data["paper_id"].to_numpy()
# Obtain ground truth labels corresponding to each paper_id
train_labels = train_data["subject"].to_numpy()
test_labels = test_data["subject"].to_numpy()
# Define graph, namely an edge tensor and a node feature tensor
edges = tf.convert_to_tensor(citations[["target", "source"]])
node_states = tf.convert_to_tensor(papers.sort_values("paper_id").iloc[:, 1:-1])
Checking the shapes of the tensors in question:
# Print shapes of the graph
print("Edges shape:\t\t", edges.shape)
Edges shape: (5429, 2)
print("Node features shape:", node_states.shape)
Node features shape: (2708, 1433)
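As a quick sanity check (not part of the reference program), one can verify that the random split really is 50/50:

print("Train/test sizes:", len(train_indices), "/", len(test_indices))
# expected with 2708 papers: 1354 / 1354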
The real challenge begins here, with the timer started:
# timer
start_time = time.time()
# model -----------------------------------------------------------------------
class GraphAttention(layers.Layer):
def __init__(
self,
units,
kernel_initializer="glorot_uniform",
kernel_regularizer=None,
**kwargs,
):
super().__init__(**kwargs)
self.units = units
self.kernel_initializer = keras.initializers.get(kernel_initializer)
self.kernel_regularizer = keras.regularizers.get(kernel_regularizer)
def build(self, input_shape):
self.kernel = self.add_weight(
shape=(input_shape[0][-1], self.units),
trainable=True,
initializer=self.kernel_initializer,
regularizer=self.kernel_regularizer,
name="kernel",
)
self.kernel_attention = self.add_weight(
shape=(self.units * 2, 1),
trainable=True,
initializer=self.kernel_initializer,
regularizer=self.kernel_regularizer,
name="kernel_attention",
)
self.built = True
def call(self, inputs):
node_states, edges = inputs
# Linearly transform node states
node_states_transformed = tf.matmul(node_states, self.kernel)
# (1) Compute pair-wise attention scores
node_states_expanded = tf.gather(node_states_transformed, edges)
node_states_expanded = tf.reshape(
node_states_expanded, (tf.shape(edges)[0], -1)
)
attention_scores = tf.nn.leaky_relu(
tf.matmul(node_states_expanded, self.kernel_attention)
)
attention_scores = tf.squeeze(attention_scores, -1)
# (2) Normalize attention scores
attention_scores = tf.math.exp(tf.clip_by_value(attention_scores, -2, 2))
attention_scores_sum = tf.math.unsorted_segment_sum(
data=attention_scores,
segment_ids=edges[:, 0],
num_segments=tf.reduce_max(edges[:, 0]) + 1,
)
attention_scores_sum = tf.repeat(
attention_scores_sum, tf.math.bincount(tf.cast(edges[:, 0], "int32"))
)
attention_scores_norm = attention_scores / attention_scores_sum
# (3) Gather node states of neighbors, apply attention scores and aggregate
node_states_neighbors = tf.gather(node_states_transformed, edges[:, 1])
out = tf.math.unsorted_segment_sum(
data=node_states_neighbors * attention_scores_norm[:, tf.newaxis],
segment_ids=edges[:, 0],
num_segments=tf.shape(node_states)[0],
)
return out
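# Note on step (2) above: the clip + exp + unsorted_segment_sum sequence is a
# softmax over each node's incoming edges, with logits clipped to [-2, 2] for
# numerical stability. For instance, if edges[:, 0] = [0, 0, 1] and the raw
# scores are [a, b, c], the segment sum gives [exp(a)+exp(b), exp(c)];
# tf.repeat then expands it back to edge level so that the attention weights
# pointing at each target node sum to 1 (this relies on the edge list being
# sorted by target, which holds for cora.cites).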
class MultiHeadGraphAttention(layers.Layer):
def __init__(self, units, num_heads=8, merge_type="concat", **kwargs):
super().__init__(**kwargs)
self.num_heads = num_heads
self.merge_type = merge_type
self.attention_layers = [GraphAttention(units) for _ in range(num_heads)]
def call(self, inputs):
atom_features, pair_indices = inputs
# Obtain outputs from each attention head
outputs = [
attention_layer([atom_features, pair_indices])
for attention_layer in self.attention_layers
]
# Concatenate or average the node states from each head
if self.merge_type == "concat":
outputs = tf.concat(outputs, axis=-1)
else:
outputs = tf.reduce_mean(tf.stack(outputs, axis=-1), axis=-1)
# Activate and return node states
return tf.nn.relu(outputs)
class GraphAttentionNetwork(keras.Model):
def __init__(
self,
node_states,
edges,
hidden_units,
num_heads,
num_layers,
output_dim,
**kwargs,
):
super().__init__(**kwargs)
self.node_states = node_states
self.edges = edges
self.preprocess = layers.Dense(hidden_units * num_heads, activation="relu")
self.attention_layers = [
MultiHeadGraphAttention(hidden_units, num_heads) for _ in range(num_layers)
]
self.output_layer = layers.Dense(output_dim)
def call(self, inputs):
node_states, edges = inputs
x = self.preprocess(node_states)
for attention_layer in self.attention_layers:
x = attention_layer([x, edges]) + x
outputs = self.output_layer(x)
return outputs
def train_step(self, data):
indices, labels = data
with tf.GradientTape() as tape:
# Forward pass
outputs = self([self.node_states, self.edges])
# Compute loss
loss = self.compiled_loss(labels, tf.gather(outputs, indices))
# Compute gradients
grads = tape.gradient(loss, self.trainable_weights)
# Apply gradients (update weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
# Update metric(s)
self.compiled_metrics.update_state(labels, tf.gather(outputs, indices))
return {m.name: m.result() for m in self.metrics}
def predict_step(self, data):
indices = data
# Forward pass
outputs = self([self.node_states, self.edges])
# Compute probabilities
return tf.nn.softmax(tf.gather(outputs, indices))
def test_step(self, data):
indices, labels = data
# Forward pass
outputs = self([self.node_states, self.edges])
# Compute loss
loss = self.compiled_loss(labels, tf.gather(outputs, indices))
# Update metric(s)
self.compiled_metrics.update_state(labels, tf.gather(outputs, indices))
return {m.name: m.result() for m in self.metrics}
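# Note: train_step, test_step and predict_step all run a forward pass over the
# full graph (all node states and all edges) and only then gather the output
# rows for the batch indices. This is the usual transductive set-up on Cora:
# the graph structure of the test nodes is visible during training, but their
# labels never are.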
# train
# Define hyper-parameters
HIDDEN_UNITS = 100
NUM_HEADS = 8
NUM_LAYERS = 3
OUTPUT_DIM = len(class_values)
NUM_EPOCHS = 100
BATCH_SIZE = 256
VALIDATION_SPLIT = 0.1
LEARNING_RATE = 3e-1
MOMENTUM = 0.9
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = keras.optimizers.SGD(LEARNING_RATE, momentum=MOMENTUM)
accuracy_fn = keras.metrics.SparseCategoricalAccuracy(name="acc")
early_stopping = keras.callbacks.EarlyStopping(
monitor="val_acc", min_delta=1e-5, patience=5, restore_best_weights=True
)
# Build model
gat_model = GraphAttentionNetwork(
node_states, edges, HIDDEN_UNITS, NUM_HEADS, NUM_LAYERS, OUTPUT_DIM
)
# Compile model
gat_model.compile(loss=loss_fn, optimizer=optimizer, metrics=[accuracy_fn])
gat_model.fit(
x=train_indices,
y=train_labels,
validation_split=VALIDATION_SPLIT,
batch_size=BATCH_SIZE,
epochs=NUM_EPOCHS,
callbacks=[early_stopping],
verbose=2,
)
Epoch 1/100
5/5 - 8s - loss: 1.8487 - acc: 0.2783 - val_loss: 1.5296 - val_acc: 0.4485
Epoch 2/100
5/5 - 0s - loss: 1.2256 - acc: 0.5829 - val_loss: 1.0110 - val_acc: 0.7279
Epoch 3/100
5/5 - 0s - loss: 0.6838 - acc: 0.8120 - val_loss: 0.7001 - val_acc: 0.8015
Epoch 4/100
5/5 - 0s - loss: 0.3986 - acc: 0.8842 - val_loss: 0.7902 - val_acc: 0.7941
Epoch 5/100
5/5 - 0s - loss: 0.2429 - acc: 0.9343 - val_loss: 0.7304 - val_acc: 0.8162
Epoch 6/100
5/5 - 0s - loss: 0.1470 - acc: 0.9655 - val_loss: 0.8260 - val_acc: 0.8015
Epoch 7/100
5/5 - 0s - loss: 0.0896 - acc: 0.9852 - val_loss: 0.7618 - val_acc: 0.8235
Epoch 8/100
5/5 - 0s - loss: 0.0566 - acc: 0.9959 - val_loss: 0.6827 - val_acc: 0.8235
Epoch 9/100
5/5 - 0s - loss: 0.0403 - acc: 0.9951 - val_loss: 0.7350 - val_acc: 0.8456
Epoch 10/100
5/5 - 0s - loss: 0.0313 - acc: 0.9959 - val_loss: 0.7944 - val_acc: 0.8235
Epoch 11/100
5/5 - 0s - loss: 0.0275 - acc: 0.9975 - val_loss: 0.8226 - val_acc: 0.8162
Epoch 12/100
5/5 - 0s - loss: 0.0221 - acc: 0.9975 - val_loss: 0.8895 - val_acc: 0.8235
Epoch 13/100
5/5 - 0s - loss: 0.0168 - acc: 0.9992 - val_loss: 0.8565 - val_acc: 0.8309
Epoch 14/100
5/5 - 0s - loss: 0.0125 - acc: 0.9992 - val_loss: 0.8511 - val_acc: 0.8309
<keras.callbacks.History object at 0x000000006A46BD68>
_, test_accuracy = gat_model.evaluate(x=test_indices, y=test_labels, verbose=0)
# prediction ------------------------------------------------------------------
test_probs = gat_model.predict(x=test_indices)
mapping = {v: k for (k, v) in class_idx.items()}
# END OF TIME MEASUREMENT, AFTER PREDICTIONS HAVE BEEN MADE
# ON THE FULL TEST SET
interval = time.time() - start_time
The work is done, and we have collected both the accuracy and the processing time needed to create, train and test our model.
To demonstrate that we can actually use this model, we will now make 10 predictions.
# 10 predictions
for i, (probs, label) in enumerate(zip(test_probs[:10], test_labels[:10])):
print(f"Example {i+1}: {mapping[label]}")
for j, c in zip(probs, class_idx.keys()):
print(f"\tProbability of {c: <24} = {j*100:7.3f}%")
print("---" * 20)
# end -------------------------------------------------------------------------
Example 1: Probabilistic_Methods
Probability of Case_Based = 0.994%
Probability of Genetic_Algorithms = 0.052%
Probability of Neural_Networks = 10.639%
Probability of Probabilistic_Methods = 87.432%
Probability of Reinforcement_Learning = 0.188%
Probability of Rule_Learning = 0.007%
Probability of Theory = 0.689%
------------------------------------------------------------
Example 2: Genetic_Algorithms
Probability of Case_Based = 0.000%
Probability of Genetic_Algorithms = 100.000%
Probability of Neural_Networks = 0.000%
Probability of Probabilistic_Methods = 0.000%
Probability of Reinforcement_Learning = 0.000%
Probability of Rule_Learning = 0.000%
Probability of Theory = 0.000%
------------------------------------------------------------
Example 3: Theory
Probability of Case_Based = 4.472%
Probability of Genetic_Algorithms = 0.190%
Probability of Neural_Networks = 0.019%
Probability of Probabilistic_Methods = 16.739%
Probability of Reinforcement_Learning = 0.444%
Probability of Rule_Learning = 1.738%
Probability of Theory = 76.398%
------------------------------------------------------------
Example 4: Neural_Networks
Probability of Case_Based = 0.000%
Probability of Genetic_Algorithms = 0.000%
Probability of Neural_Networks = 99.963%
Probability of Probabilistic_Methods = 0.031%
Probability of Reinforcement_Learning = 0.000%
Probability of Rule_Learning = 0.000%
Probability of Theory = 0.005%
------------------------------------------------------------
Example 5: Theory
Probability of Case_Based = 12.009%
Probability of Genetic_Algorithms = 7.963%
Probability of Neural_Networks = 3.157%
Probability of Probabilistic_Methods = 27.334%
Probability of Reinforcement_Learning = 0.657%
Probability of Rule_Learning = 30.371%
Probability of Theory = 18.510%
------------------------------------------------------------
Example 6: Genetic_Algorithms
Probability of Case_Based = 0.000%
Probability of Genetic_Algorithms = 100.000%
Probability of Neural_Networks = 0.000%
Probability of Probabilistic_Methods = 0.000%
Probability of Reinforcement_Learning = 0.000%
Probability of Rule_Learning = 0.000%
Probability of Theory = 0.000%
------------------------------------------------------------
Example 7: Neural_Networks
Probability of Case_Based = 0.005%
Probability of Genetic_Algorithms = 0.007%
Probability of Neural_Networks = 98.378%
Probability of Probabilistic_Methods = 1.453%
Probability of Reinforcement_Learning = 0.000%
Probability of Rule_Learning = 0.002%
Probability of Theory = 0.156%
------------------------------------------------------------
Example 8: Genetic_Algorithms
Probability of Case_Based = 0.000%
Probability of Genetic_Algorithms = 100.000%
Probability of Neural_Networks = 0.000%
Probability of Probabilistic_Methods = 0.000%
Probability of Reinforcement_Learning = 0.000%
Probability of Rule_Learning = 0.000%
Probability of Theory = 0.000%
------------------------------------------------------------
Example 9: Theory
Probability of Case_Based = 1.714%
Probability of Genetic_Algorithms = 1.794%
Probability of Neural_Networks = 19.014%
Probability of Probabilistic_Methods = 72.576%
Probability of Reinforcement_Learning = 0.905%
Probability of Rule_Learning = 0.504%
Probability of Theory = 3.493%
------------------------------------------------------------
Example 10: Case_Based
Probability of Case_Based = 99.873%
Probability of Genetic_Algorithms = 0.001%
Probability of Neural_Networks = 0.001%
Probability of Probabilistic_Methods = 0.085%
Probability of Reinforcement_Learning = 0.001%
Probability of Rule_Learning = 0.030%
Probability of Theory = 0.009%
------------------------------------------------------------
Here is the line you must produce, with your own values. Please follow this wording scrupulously so that we can process it automatically (do not forget that we will run your program 10 times and average the results of those 10 runs).
print("--" *20 +f"\nAccuracy = {test_accuracy*100:.1f}%, Time = {interval:.1f} s")
----------------------------------------
Accuracy = 78.7%, Time = 15.2 s
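For completeness, a line in this format can be recovered automatically; the regular expression below is our own sketch, not the organizers' actual parser:

import re
line = "Accuracy = 78.7%, Time = 15.2 s"
m = re.match(r"Accuracy = ([\d.]+)%, Time = ([\d.]+) s", line)
accuracy, seconds = float(m.group(1)), float(m.group(2))
# accuracy == 78.7, seconds == 15.2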