Semyon Grigorev

Biography

Professional Activities

Lexical and syntactic analysis, software reengineering, parallel and asynchronous computing.

More details:

Academic Supervision

  • Anastasiya Ragozina
  • Andrei Ivanov
  • Artem Gorokhov
  • Ekaterina Verbitskaia
  • Marat Khabibullin
  • Marina Polubelova
  • Rustam Azimov

Projects

Publications

  • Ciro Medeiros, Umberto Costa, Semyon Grigorev, Martin A. Musicante
    Regular expressions are used in SPARQL property paths to query RDF graphs. However, regular expressions can only define the most limited class of languages, called regular languages. Context-free languages are a wider class containing all regular languages. There are no context-free expressions to define them, so it is necessary to write grammars. We propose an extension of regular expressions, called recursive expressions, to support the definition of a subset of context-free languages. The goal of our work is therefore to provide simple operators allowing the definition of languages as close as possible to context-free languages.
    ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium, August 2020
  • Egor Orachev, Ilya Epelbaum, Rustam Azimov, Semyon Grigorev

    Context-free path queries (CFPQ) extend regular path queries (RPQ) by allowing context-free grammars to be used as constraints for paths. Algorithms for CFPQ are actively developed, but J. Kuijpers et al. have recently concluded that existing algorithms are not performant enough to be used in real-world applications. Thus the development of new algorithms for CFPQ is justified. In this paper, we provide a new CFPQ algorithm which is based on such linear algebra operations as Kronecker product and transitive closure and handles grammars presented as recursive state machines. Thus, the proposed algorithm can be implemented by using high-performance libraries and modern parallel hardware. Moreover, it avoids grammar growth, which provides the possibility for query optimization.

    ADBIS 2020. Advances in Databases and Information Systems. Lecture Notes in Computer Science, August 2020
  • Susanina Y.A., Yaveyn A.N., Grigorev S.V.

    In this work, we propose an algorithm that is a modification of Valiant's algorithm. Its main advantage is the ability to split the parsing matrix into layers of disjoint submatrices that can be processed independently. We prove the correctness of the proposed algorithm and give an estimate of its complexity. Experiments show that it retains the main advantages of the original algorithm, chief among them high performance achieved through efficient matrix multiplication methods. The proposed algorithm also noticeably reduces the time spent on substring search by eliminating a large number of redundant computations.

    Proceedings of the Institute for System Programming, June 2020
  • Arseniy Terekhov, Artyom Khoroshev, Rustam Azimov, Semyon Grigorev

    A recent study showed that the applicability of context-free path querying (CFPQ) algorithms with relational query semantics integrated with graph databases is limited because of low performance and high memory consumption of existing solutions. In this work, we implement a matrix-based CFPQ algorithm by using appropriate high-performance libraries for linear algebra and integrate it with the RedisGraph graph database. Also, we introduce a new CFPQ algorithm with single-path query semantics that allows us to extract one found path for each pair of nodes. Finally, we provide an evaluation of our algorithms for both semantics, which shows that the matrix-based CFPQ implementation for the RedisGraph database is performant enough for real-world data analysis.

    GRADES-NDA'20: Proceedings of the 3rd Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), June 2020
  • Aleksey Tyurin, Daniil Berezun, Semyon Grigorev

    While GPU utilization allows one to speed up computations by orders of magnitude, memory management remains the bottleneck, often making it a challenge to achieve the desired performance. Hence, different memory optimizations are leveraged to use memory more effectively. We propose an approach that automates memory management by utilizing partial evaluation, a program transformation technique that enables data accesses to be pre-computed, optimized, and embedded into the code, saving memory transactions. An empirical evaluation of our approach shows that the transformed program could be up to 8 times as efficient as the original one in the case of a naïve string pattern matching algorithm implemented in CUDA C.

    PPoPP '20: Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, February 2020
  • R. Azimov and S. Grigorev

    Path querying with conjunctive grammars is known to be undecidable. There is an algorithm for path querying with linear conjunctive grammars which provides an over-approximation of the result, but there is no algorithm for arbitrary conjunctive grammars. We propose the first algorithm for path querying with arbitrary conjunctive grammars. The proposed algorithm is matrix-based and allows us to efficiently apply GPGPU computing techniques and other optimizations for matrix operations.

    Programming and Computer Software, December 2019
  • Semyon Grigorev and Polina Lunina
    BMC Bioinformatics, November 2019
  • Nikita Mishin, Iaroslav Sokolov, Egor Spirin, Vladimir Kutuev, Egor Nemchinov, Sergey Gorbatyuk, and Semyon Grigorev

    The recently proposed matrix-multiplication-based algorithm for context-free path querying (CFPQ) offloads the most performance-critical parts onto Boolean matrix multiplication. Thus, it is possible to achieve high performance of CFPQ by means of modern parallel hardware and software. In this paper, we provide the results of an empirical performance comparison of different implementations of this algorithm on both real-world data and synthetic data for the worst cases.

    Proceedings of the 2nd Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), June 2019
  • Sergey Bozhko, Leyla Khatbullina, Semyon Grigorev

    The Bar-Hillel theorem states that context-free languages are closed under intersection with a regular set. This theorem has a constructive proof and thus provides a formal justification of correctness of the algorithms for applications mentioned above. Mechanization of the Bar-Hillel theorem, therefore, is both a fundamental result of formal language theory and a basis for the certified implementation of the algorithms for applications. In this work, we present the mechanized proof of the Bar-Hillel theorem in Coq.

    Logic, Language, Information, and Computation, June 2019
  • Semyon Grigorev and Polina Lunina

    We propose a way to combine formal grammars and artificial neural networks for biological sequence processing. Formal grammars encode the secondary structure of the sequence, and neural networks deal with mutations and noise. In contrast to the classical approach, in which probabilistic grammars are used for secondary structure modeling, we propose to use arbitrary (not probabilistic) grammars, which simplifies grammar creation. Instead of modeling the structure of the whole sequence, we create a grammar which only describes features of the secondary structure. Then we use matrix-based parsing to extract features: the fact that some substring can be derived from some nonterminal is a feature. After that, we use a dense neural network to process the features.

    Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies - BIOINFORMATICS, March 2019
  • Shemetova E.N., Grigorev S.V.

    One of the main problems associated with graph models is finding specific paths in a graph. A natural way to specify path constraints is with formal grammars over the edge labels of the graph. A query to the graph can then be represented as the set of all triples (A, v1, v2) for which there exists a path in the graph from vertex v1 to vertex v2 such that the labels on the edges of this path form a string derivable from the nonterminal A in the given grammar. This work studies Boolean grammars. The problem of querying graphs with Boolean grammars is known to be undecidable. We propose an approximate path-finding algorithm for directed acyclic graphs with constraints specified by Boolean grammars. Thanks to the restriction on the type of graphs analyzed, the proposed algorithm is asymptotically more efficient than the naive iterative algorithm.

    Proceedings of the Institute for System Programming, January 2019
  • Ekaterina Verbitskaia, Ilya Kirillov, Ilya Nozkin, Semyon Grigorev

    Transparent integration of a domain-specific language for specification of context-free path queries (CFPQs) into a general-purpose programming language as well as static checking of errors in queries may greatly simplify the development of applications using CFPQs. LINQ and ORM can be used for the integration, but they have issues with flexibility: query decomposition and reusing of subqueries are a challenge. Adaptation of parser combinators technique for paths querying may solve these problems. Conventional parser combinators process linear input, and only the Trails library is known to apply this technique for path querying. We demonstrate that it is possible to create general parser combinators for CFPQ which support arbitrary context-free grammars and arbitrary input graphs. We implement a library of such parser combinators and show that it is applicable for realistic tasks.

    Proceedings of the 9th ACM SIGPLAN International Symposium on Scala, September 2018
  • Kirill Smirenko, Semyon Grigorev

    Extended abstract at TyDe 2018 (at ICFP).

    September 2018
  • Rustam Azimov, Semyon Grigorev
    GRADES-NDA '18 Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), June 2018
  • Semyon Grigorev, Anastasiya Ragozina

    There are several solutions for CFPQ, but how to provide a structural representation of the query result that is practical for answer processing and debugging is still an open problem. In this paper we propose a graph parsing technique which allows one to build such a representation with respect to a given grammar in polynomial time and space for an arbitrary context-free grammar and graph. The proposed algorithm is based on the generalized LL parsing algorithm, while previous solutions are based mostly on CYK or Earley algorithms, which reduces time complexity in some cases.

    Proceedings of the 13th Central & Eastern European Software Engineering Conference in Russia (CEE-SECR '17), December 2017
  • Rustam Azimov, Semyon Grigorev
    arXiv, July 2017
  • Marina Polubelova, Sergey Bozhko, Semyon Grigorev
    Proceedings of the Institute for System Programming, August 2016
  • Ekaterina Verbitskaia, Semyon Grigorev, Dmitry Avdyukhin

    We present a technique for syntax analysis of a regular set of input strings. This problem is relevant for the analysis of string-embedded languages when a host program generates clauses of an embedded language at run time. Our technique is based on a generalization of the RNGLR algorithm, which inherently allows us to construct a finite representation of the parse forest for a regularly approximated set of input strings. This representation can be further utilized for semantic analysis and transformations in the context of reengineering, code maintenance, program understanding, etc. The approach in question implements relaxed parsing: non-recognized strings in the approximation set are ignored with no error detection.

    Perspectives of System Informatics, June 2016
  • Marina Polubelova, Semyon Grigorev
    Systems and Means of Informatics, 2016
  • Ekaterina Verbitskaia, Semyon Grigorev and Dmitry Avdyukhin
    Proceedings of 10th International Andrei Ershov Memorial Conference on Perspectives of System Informatics, 2015
  • Marat Khabibullin, Andrei Ivanov, Semyon Grigorev
    Proceedings of the 11th Central & Eastern European Software Engineering Conference in Russia, 2015
  • Ragozina Anastasiya, Grigorev Semyon
    Systems and Means of Informatics, 2015
  • Semen Grigorev, Ekaterina Verbitskaia, Andrei Ivanov, Marina Polubelova, and Ekaterina Mavchun
    Proceedings of the 10th Central and Eastern European Software Engineering Conference in Russia 2014, 2014
  • Semen Grigorev and Iakov Kirilenko
    Proceedings of the 9th Central & Eastern European Software Engineering Conference in Russia, 2013