RIST

Revue d'Information Scientifique et Technique

Arabic Word Sense Induction in a Word Vector Space

In this article, we describe a new approach to word sense induction for Arabic in a word vector space. Vector representation models are attracting great interest from the NLP research community. These models rest on the distributional hypothesis, which takes into account the "context" of a target word. They map every word in the vocabulary to a vector space and thereby provide a semantic description of the words of a corpus as numerical vectors. A well-known limitation of these models, however, is that they cannot handle polysemy. We present a new, simple model that uses word embeddings, which we evaluate on the unsupervised task of Arabic word sense induction. The models are built with the GenSim tools for Skip-Gram and CBOW. An index based on cosine similarity is then created with the Annoy indexer, which is faster than GenSim's similarity function. An ego-network, a structure used to study the relations of an individual, makes it possible to build a graph of related words from a word's local neighbors. The different senses of a word are generated by graph clustering. We worked with two corpora, OSAC and AraCorpus, as well as an existing word embedding model, AraVec. We then tested the different models on word sense induction and obtained promising results.
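
The ego-network step described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the 2-D vectors are hypothetical toy values, the English words stand in for Arabic vocabulary, a brute-force cosine search replaces the GenSim and Annoy tooling, and connected components serve as a deliberately simple stand-in for the graph clustering step, whose exact algorithm the abstract does not name.

```python
import math
from itertools import combinations

# Hypothetical toy embeddings (illustration only, not trained vectors).
VECTORS = {
    "bank": (1.0, 1.0),
    "money": (1.0, 0.1),
    "loan": (0.9, 0.2),
    "river": (0.1, 1.0),
    "shore": (0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def nearest_neighbours(word, k):
    """Brute-force k nearest neighbours of `word` by cosine similarity."""
    others = [w for w in VECTORS if w != word]
    others.sort(key=lambda w: cosine(VECTORS[word], VECTORS[w]), reverse=True)
    return others[:k]


def ego_network(word, k=4, threshold=0.9):
    """Graph over the neighbours of `word`; the ego itself is left out."""
    nodes = nearest_neighbours(word, k)
    edges = {n: set() for n in nodes}
    for a, b in combinations(nodes, 2):
        if cosine(VECTORS[a], VECTORS[b]) >= threshold:
            edges[a].add(b)
            edges[b].add(a)
    return edges


def induce_senses(word, k=4, threshold=0.9):
    """Treat each connected component of the ego network as one sense."""
    graph = ego_network(word, k, threshold)
    seen, senses = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        senses.append(sorted(component))
    return sorted(senses)


senses = induce_senses("bank")
```

With these toy vectors, the neighbours of "bank" split into a financial group and a geographic group, because the two groups are not linked to each other once the ego word is removed.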

Authors: Djaidri Asma, Aliane Hassina, Azzoune Hamida

Download: PDF

Impact of Stemming Techniques on Topic Segmentation of Arabic Texts

In this paper, we propose a topic segmentation approach for Arabic texts and use it to study the effect of two stemming techniques: root-based stemming and light stemming. The proposed approach is global, distributional, and non-linear. It is global because it compares all text segments, not only neighboring ones. It is non-linear in the sense that it can place segments located at different positions in the text in the same group (subtopic). The approach computes lexical cohesion between segments from a combination of lexical and semantic repetition criteria. For term weighting, we use the Okapi BM25 measure after stemming with either root-based or light stemming. Semantic repetitions of terms are computed using the Arabic WordNet lexical database. A similarity matrix is built whose rows and columns are the text segments and whose elements are cosine scores between pairs of segments. Subtopics are finally formed with a strict clustering technique in order to eliminate redundancy in the segment groups. We tested our system on a collection of economic and web news articles using Recall, Precision, F-measure, and WindowDiff. The results obtained are very promising.
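
The term weighting and similarity matrix described above might look roughly like the following sketch. It is an assumption-laden illustration rather than the authors' system: the segments are hypothetical, already-stemmed English placeholders, the BM25 parameters `k1` and `b` are common default values, the IDF uses the non-negative variant with `+ 1` inside the logarithm, and the Arabic WordNet semantic repetitions are left out.

```python
import math
from collections import Counter


def bm25_vectors(segments, k1=1.5, b=0.75):
    """Return one {term: BM25 weight} dict per tokenised segment."""
    n = len(segments)
    avg_len = sum(len(seg) for seg in segments) / n
    # Document frequency: number of segments containing each term.
    df = Counter(term for seg in segments for term in set(seg))
    vectors = []
    for seg in segments:
        tf = Counter(seg)
        length_norm = k1 * (1 - b + b * len(seg) / avg_len)
        vec = {}
        for term, freq in tf.items():
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            vec[term] = idf * freq * (k1 + 1) / (freq + length_norm)
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine score between two sparse weight vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(segments):
    """Rows and columns are segments; cells are cosine scores."""
    vecs = bm25_vectors(segments)
    return [[cosine(a, b) for b in vecs] for a in vecs]


# Hypothetical stemmed segments: two about economics, one about sport.
SEGMENTS = [
    ["bank", "rate", "inflat"],
    ["rate", "inflat", "market"],
    ["match", "goal", "team"],
]
matrix = similarity_matrix(SEGMENTS)
```

Because every pair of segments is compared, not only adjacent ones, such a matrix can feed a clustering step that groups distant segments into the same subtopic.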

 

Authors: Belahcene Bahloul, Hassina Aliane, Mohamed Benmohammed

Download: PDF

Optimized Extraction of Association Rules

Data mining, also known by several other names such as data exploration or data prospecting, aims to extract knowledge from large volumes of data. Data mining techniques convert data into knowledge that can support decision making. In this field, association rule mining is a widely used method for discovering relationships of interest for decision making, particularly when databases are very large. The major challenge, however, is finding the most relevant association rules. In this article, we propose a framework whose goal is to generate optimized association rules. We consider three approaches: the first applies three data mining algorithms, Apriori, Close, and Charm; the second complements these algorithms with a genetic algorithm; and the last first applies the genetic algorithm to the database, for diversification, and then runs the data mining algorithms. Since our goal is to generate the most relevant association rules, we propose a new quality measure for the generated rules. The experimental results demonstrate the benefit of combining data mining techniques with genetic algorithms.
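
For readers unfamiliar with association rules, the sketch below shows an Apriori-style level-wise search followed by rule generation. It is a minimal illustration under stated assumptions: the transactions are hypothetical, rules are filtered with the classical support and confidence measures, and neither the genetic algorithm nor the new quality measure proposed in the article is reproduced here.

```python
from itertools import combinations

# Hypothetical toy transaction database.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    items = sorted(set().union(*transactions))
    current = [frozenset([i]) for i in items
               if support(frozenset([i]), transactions) >= min_support]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = [c for c in candidates
                   if all(frozenset(sub) in frequent
                          for sub in combinations(c, k - 1))
                   and support(c, transactions) >= min_support]
    return frequent


def rules(frequent, min_confidence):
    """Generate (lhs, rhs, support, confidence) tuples from frequent itemsets."""
    found = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((set(lhs), set(itemset - lhs), sup, confidence))
    return found


frequent = frequent_itemsets(TRANSACTIONS, min_support=0.5)
strong_rules = rules(frequent, min_confidence=0.9)
```

On this toy database, the only rule that passes both thresholds is butter → bread, since every transaction containing butter also contains bread.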

Author: Nour-Eddine Aissaouia

Download: PDF

Applications of Graph Databases and Big Data Technologies in Healthcare

Several aspects of big data technologies in healthcare, such as architecture and capabilities, have been surveyed. Many works also propose the use of graph databases in the healthcare domain. However, to the best of our knowledge, no work addresses the challenges related to both big data technologies and graph databases in healthcare. For this reason, we present a survey of big data in healthcare based on graph databases. The paper presents a gap analysis based on a set of papers on healthcare systems that rely on graph databases and big data technologies.

Authors: Faiza Deghmani, Idir Amine Amarouche, Kamel Boukhalfa

Download: PDF

Norm Regularization Method for Additive Noise Removal

Image denoising is a widely studied problem in image processing. The denoising problem is to recover the original image u from the observed image f. In this paper, l1-norm and l2-norm regularization are studied, developed, and implemented to restore images contaminated by additive noise. To solve these two problems, a finite difference discretization is applied before using the gradient descent algorithm on the noisy signal. In the experiments, the two methods are applied to test images with different noise levels and compared using the quality metrics Signal-to-Noise Ratio (SNR), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index (SSIM). The study shows that the algorithm that minimizes the l2-norm of the image gradient has a unique solution and is easy to implement, but it does not allow contour discontinuities, so the solution obtained is smooth: the l2-norm blurs the edges of the image. The l1-norm is introduced to preserve sharp edges. We can therefore confirm that l1 regularization encourages image smoothness while allowing jumps and discontinuities, a key feature for image processing because of the importance of edges in human vision.
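
The two regularizers can be compared on a 1-D signal with a short gradient descent sketch. This is an illustration under several assumptions, not the authors' code: the noisy step signal, the weight `LAM`, and the smoothing constant `EPS` are hypothetical, forward differences stand in for the finite difference discretization, and the l1 penalty is replaced by the smooth approximation sqrt(d^2 + EPS) so that plain gradient descent applies.

```python
import math


def diffs(u):
    """Forward finite differences u[i+1] - u[i]."""
    return [u[i + 1] - u[i] for i in range(len(u) - 1)]


def energy(u, f, lam, penalty):
    """0.5 * ||u - f||^2 + lam * sum(penalty(u[i+1] - u[i]))."""
    fidelity = 0.5 * sum((a - b) ** 2 for a, b in zip(u, f))
    return fidelity + lam * sum(penalty(d) for d in diffs(u))


def denoise(f, lam, penalty_grad, step, iterations=200):
    """Gradient descent on the energy above, starting from u = f."""
    u = list(f)
    for _ in range(iterations):
        slopes = [penalty_grad(d) for d in diffs(u)]
        grad = [a - b for a, b in zip(u, f)]  # gradient of the fidelity term
        for i, s in enumerate(slopes):
            # d/du[i] of penalty(u[i+1] - u[i]) is -penalty', d/du[i+1] is +penalty'.
            grad[i] -= lam * s
            grad[i + 1] += lam * s
        u = [a - step * g for a, g in zip(u, grad)]
    return u


EPS = 1e-2  # smoothing constant for the l1 penalty (assumed value)
LAM = 0.5   # regularization weight (assumed value)


def l2(d):
    return 0.5 * d * d


def l2_grad(d):
    return d


def l1(d):
    return math.sqrt(d * d + EPS)  # smooth approximation of |d|


def l1_grad(d):
    return d / math.sqrt(d * d + EPS)


# Hypothetical noisy step edge: about 0 on the left, about 1 on the right.
F = [0.1, -0.1, 0.05, 0.0, 1.1, 0.9, 1.0, 1.05]

# Step sizes are set to 1/L, where L bounds the curvature of each energy.
u_l2 = denoise(F, LAM, l2_grad, step=1.0 / (1.0 + 4.0 * LAM))
u_l1 = denoise(F, LAM, l1_grad, step=1.0 / (1.0 + 4.0 * LAM / math.sqrt(EPS)))
```

Comparing the jump `u[4] - u[3]` in `u_l2` and `u_l1` gives a quick way to see how each penalty treats the edge in this toy signal.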

 

Authors: Nacira Diffellah, Rabah Hamdini, Tewfik Bekkouche

Download: PDF

DM-RPL: Disjoint Multipath RPL for Bandwidth Provision in the Internet of Multimedia Things.

The Internet of Multimedia Things (IoMT) is a highly active topic within the Internet of Things (IoT) owing to the rapid growth of multimedia applications in several fields. In Low-power and Lossy Networks (LLNs), where sensor nodes are a key component, providing satisfactory quality of service (QoS) and user quality of experience (QoE) for such applications is challenging, because high bandwidth and substantial computation resources are required. To provide sufficient bandwidth for these high data rate applications, we propose extending RPL to enable the simultaneous use of multiple disjoint paths. This is done on top of the already maintained DODAG structure with minimal induced overhead. Furthermore, we suggest applying a low-complexity encoding method to the captured images. Based on both QoS and QoE metrics, we evaluate the performance of our disjoint multipath RPL (DM-RPL) for real video clip transmission using the IoT-LAB testbed. Our results show that multipath provides more bandwidth, as the packet delivery ratio (PDR) increases. Video quality is further improved thanks to the data reduction adopted at the source. All of this translates into lower energy consumption.

Authors: Souhila Kettouche, Moufida Maimour, Lakhdar Derdouri

Download: PDF

An Efficient Particle Swarm Optimization of RFM for ALSAT2 Images Ortho-rectification

Image ortho-rectification is a standard process in remote sensing for correcting the geometric distortions and relief displacement errors introduced by the payload system at imaging time. It requires either a precise rigorous sensor model or a rational function model refined using well-distributed ground control points (GCPs). The rational function model (RFM) is commonly used because it is simple and does not need sensor parameters. However, the RFM terms, also called rational polynomial coefficients (RPCs), have no physical significance and depend on many GCPs, which makes the model prone to over-parameterization. Meta-heuristic algorithms are well suited to RFM optimization. This paper proposes a binary particle swarm optimization (BPSO) with a new transfer function to overcome over-parameterization and find the optimal combination of RPCs for the RFM. The algorithm is applied to ALSAT2 images, and the results show the effectiveness and accuracy of BPSO compared with traditional binary methods from the literature. Furthermore, a hybrid optimization technique is introduced that combines BPSO with genetic operations such as crossover and mutation in order to increase convergence speed and avoid local optima. The proposed method gives a better result than the suggested one.
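
The role of a transfer function in BPSO can be illustrated with a generic sketch. Everything specific here is an assumption: the classical sigmoid (S-shaped) transfer function is used in place of the new one proposed in the article, which the abstract does not define, and the toy fitness simply rewards a hypothetical set of "useful" coefficient indices, whereas a real fitness would measure the accuracy of the RFM fitted with the selected RPCs.

```python
import math
import random

USEFUL = {0, 2, 5}  # hypothetical indices of coefficients worth keeping
N_BITS = 8


def toy_fitness(bits):
    """Reward selected useful coefficients, penalize selected useless ones."""
    gain = sum(1 for i, b in enumerate(bits) if b and i in USEFUL)
    cost = sum(1 for i, b in enumerate(bits) if b and i not in USEFUL)
    return gain - 0.5 * cost


def sigmoid(v):
    """Classical S-shaped transfer function: velocity -> P(bit = 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso(fitness, n_bits, n_particles=10, iterations=30,
         w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximize `fitness` over bit strings; returns (best bits, best value, history)."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    history = [g_fit]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(n_bits):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                     + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # clamp the velocity
                # The transfer function turns the velocity into a bit probability.
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
            f = fitness(pos[i])
            if f > best_fit[i]:
                best_fit[i], best_pos[i] = f, pos[i][:]
                if f > g_fit:
                    g_fit, g_pos = f, pos[i][:]
        history.append(g_fit)
    return g_pos, g_fit, history


best_bits, best_value, history = bpso(toy_fitness, N_BITS)
```

Each bit of `best_bits` plays the role of an on/off switch for one coefficient, which is how a binary optimizer can address over-parameterization by dropping terms.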

 

Authors: Oussama Mezouar, F. Meskine, I. Boukerch

Download: PDF

A Survey on Identity-based Key Management Schemes in Mobile Ad hoc networks

Mobile ad hoc networks have attracted growing attention over the years, but the security issues of this type of network make it hard to realize all of their advantages. Cryptographic key management is the cornerstone of any robust network security solution. Identity-based cryptography is a promising solution that resists the key escrow problem well and is suitable for mobile ad hoc networks. In this paper, we give an overview of the most important identity-based encryption schemes proposed in the last decade, combined with other techniques that enhance them and provide better results for mobile ad hoc networks. We then give a comparative analysis to highlight their advantages and weaknesses. This work gives insight into recent research in order to point out its interesting features, take advantage of its strengths, avoid its weaknesses, and lay out future directions in this area.

Authors: Kenza Gasmi, Abdelhabib Bourouis, Rohallah Benaboud

Download: PDF

From Data and Information Processing to Knowledge Organization: Architectures, Models and Systems

In this special issue on the topic "From Data and Information Processing to Knowledge Organization: Architectures, Models and Systems", seven (07) selected communications were peer-reviewed by the program committees of the OCTA Multi-Conference (which unites 4 scientific projects: SIIE, ISKO-Maghreb, CITED and TBMS). We consider that this set of contributions, enriched by their authors at our request for this special issue, is an excellent driver of current scientific ideas and challenges in the domain of interest to the ISKO-Maghreb Society.

Author: Sahbi Sidhom

Download: PDF