Open
57 commits
ce230db
Class Point / K-means algorithm
May 5, 2021
0b33eac
number of clusters in parameters / test on datasets
May 6, 2021
8b89af6
radon cc in k_means_init / pylinting
May 11, 2021
4c7d872
Default constructor, addition between two points, multiplication by s…
May 25, 2021
ee14622
parrallel optimization in k_means_init
May 25, 2021
057457c
fix: init instead of from_seq
May 26, 2021
87000f9
Division of a point
May 26, 2021
0d9b023
parallel optimization, assign_cluster and update_cluster
May 26, 2021
da4a4d6
pylinting, typing
May 27, 2021
9f1e0fc
FIX: bad list initialization parallel list
May 27, 2021
a8d0385
Changing sample type from custom type "Point" to Tuple
Ak-iu May 27, 2021
63f5991
Merge remote-tracking branch 'origin/master'
Ak-iu May 27, 2021
c617aad
Add point_interface and changing the class Point to Point_2D
Ak-iu May 28, 2021
1ca4503
Merge remote-tracking branch 'origin/master'
Ak-iu May 28, 2021
ff39d0b
FIX: input_list type form Tuple to Point_2D
Ak-iu May 28, 2021
32f557e
Add class point_3D.py
Ak-iu May 28, 2021
7e6966d
rand_point_2D_list / rand_point_3D_list
Ak-iu May 28, 2021
9b147eb
Point_3D update
Ak-iu May 28, 2021
5b89f49
Typing Point_2D -> Point_Interface
Ak-iu May 28, 2021
eb792d5
optimization update_centroids
May 28, 2021
a9327a7
Merge remote-tracking branch 'origin/master'
May 28, 2021
f921c83
Merge remote-tracking branch 'clement/master'
May 28, 2021
467b33b
refactoring because of new point implementation
May 28, 2021
8c2cf82
use of parallelism random choice first centroid
May 31, 2021
27a5039
add point dimensions in k-means-main's options
Jun 1, 2021
84f2daa
interface convention
Jun 2, 2021
82b7a7d
parallel optimization update_centroids
Jun 4, 2021
08a4dd6
adding option to show clusters graph of 2D points
Jun 4, 2021
f6f46cd
k-means clustering documentation
Jun 7, 2021
42c7506
3d representation for Point_3D clusters
Ak-iu Jun 8, 2021
810c54b
error subtraction in distance
Jun 8, 2021
5cba1e9
adding colors 3D graph result, fix warning matplotlib
Jun 8, 2021
eb16d4c
adding Point Interface section
Jun 8, 2021
529498e
change show-clusters display message
Jun 8, 2021
34ff5ce
array module
Jun 15, 2021
d072ac2
init, str method
Jun 16, 2021
c0203ad
array hello_world
Jun 16, 2021
1779397
allgather for distribution
Jun 17, 2021
42fc4fc
distribution lines to colums
Jun 17, 2021
2acc26a
use of enum for distribution choice
Jun 21, 2021
3d21917
changes in local_index
Jun 21, 2021
724b88f
callable init function with line and column parameters
Jun 21, 2021
84cc587
init column and line distribution
Jun 21, 2021
efcd451
map function, merge of init function, generic type
Jun 22, 2021
f318b51
reduce function
Jun 22, 2021
8282bef
array interface
Jun 23, 2021
0cec3c4
sarray2d class, changes parray2d content with sarray2d
Jun 23, 2021
07fd647
array get_partition
Jun 24, 2021
aea6063
doctest, docstring array interface
Jun 24, 2021
d2717a1
map2 skeleton
Jun 25, 2021
2248184
adding to_seq skeleton
Jun 28, 2021
91a8978
new doctests with to_seq
Jun 28, 2021
e0e02a0
correction doctest get_partition
Jun 28, 2021
b51b9d3
column to line distribution
Jun 28, 2021
afc28a0
bad signature distribute
Jun 29, 2021
543bf9c
distribute signature correction
Jun 30, 2021
c16c720
missing parameter doctest distribute
Jun 30, 2021
66 changes: 65 additions & 1 deletion docs/api.rst
@@ -1,2 +1,66 @@
PySke API
=========
=========

The PySke API offers applications implemented with list and tree skeletons.
Users can run either the sequential or the parallel version.
The parallel version can run faster when it is launched on several processors, cores, or computers.

Dot Product
-----------

Discrete Fast Fourier Transform
-------------------------------

K-means Clustering
------------------

K-means clustering is an unsupervised algorithm that aims to partition a set of points into k clusters.

K-means function
^^^^^^^^^^^^^^^^

.. py:module:: pyske.examples.list.k_means

.. autofunction:: k_means
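
As a rough illustration of the partitioning described above, here is a minimal sequential sketch of the classic k-means loop in plain Python. It is not the PySke implementation: the names ``k_means_sketch``, ``closest_index`` and ``mean``, the tuple representation of points, and the seeded random choice of initial centroids are all assumptions made for this example.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]  # hypothetical point representation for this sketch


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_index(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid nearest to ``point``."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def mean(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty sequence of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means_sketch(points: Sequence[Point], k: int,
                   max_iter: int = 100, seed: int = 0):
    """Return (centroids, clusters) after at most ``max_iter`` iterations."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points chosen at random.
    centroids: List[Point] = rng.sample(list(points), k)
    clusters: List[List[Point]] = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            clusters[closest_index(point, centroids)].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [mean(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return centroids, clusters
```

Calling ``k_means_sketch(points, 2)`` on two well-separated groups of points should return one cluster per group; the result on less clearly separated data depends on the initial centroids.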

Initialization functions
^^^^^^^^^^^^^^^^^^^^^^^^

This is the standard method for initializing the centroids. It chooses the centroids so that they are as far as possible from one another.

.. autofunction:: k_means_init
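
One common way to pick centroids that are far apart is a farthest-first traversal. The sketch below shows that technique under stated assumptions; it may differ from what ``k_means_init`` actually does. The function name ``farthest_first_init``, the ``first_index`` parameter, and the tuple representation of points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]  # hypothetical point representation for this sketch


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points: Sequence[Point], k: int,
                        first_index: int = 0) -> List[Point]:
    """Pick k centroids, each as far as possible from those already chosen."""
    # Assumption: the first centroid is given by index; a real
    # implementation might draw it at random instead.
    centroids = [points[first_index]]
    while len(centroids) < k:
        # For every point, measure the distance to its nearest chosen
        # centroid, then take the point for which that distance is largest.
        next_centroid = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(next_centroid)
    return centroids
```

For the points ``(0, 0)``, ``(1, 0)``, ``(10, 0)``, ``(5, 0)`` and ``k = 3``, starting from the first point, this picks ``(0, 0)``, then ``(10, 0)``, then ``(5, 0)``.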


Point Interface
^^^^^^^^^^^^^^^

The k-means algorithm takes a list of points as a parameter. For now, two classes implement the point interface: one for 2-dimensional points and one for 3-dimensional points.

Point 2D class implementation:

.. autoclass:: pyske.core.util.point_2D.Point_2D
   :members:
   :special-members:
   :member-order: bysource
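
To give an idea of what a point interface for k-means could look like, here is a minimal sketch. The class names ``PointSketch`` and ``Point2DSketch`` and the chosen operations (addition, division by a scalar, distance) are assumptions for illustration; refer to the generated ``Point_2D`` documentation for the actual members.

```python
from abc import ABC, abstractmethod
from math import sqrt


class PointSketch(ABC):
    """Hypothetical interface: the operations k-means needs from a point."""

    @abstractmethod
    def __add__(self, other):
        """Coordinate-wise sum, used to accumulate the points of a cluster."""

    @abstractmethod
    def __truediv__(self, scalar):
        """Division by a scalar, used to turn a sum into a mean."""

    @abstractmethod
    def distance(self, other) -> float:
        """Euclidean distance to another point."""


class Point2DSketch(PointSketch):
    """Hypothetical 2-dimensional implementation of the interface."""

    def __init__(self, x: float = 0.0, y: float = 0.0) -> None:
        self.x = x
        self.y = y

    def __add__(self, other):
        return Point2DSketch(self.x + other.x, self.y + other.y)

    def __truediv__(self, scalar):
        return Point2DSketch(self.x / scalar, self.y / scalar)

    def distance(self, other) -> float:
        return sqrt((self.x - other.x) ** 2 + (self.y - other.y) ** 2)

    def __eq__(self, other) -> bool:
        return (isinstance(other, Point2DSketch)
                and (self.x, self.y) == (other.x, other.y))

    def __repr__(self) -> str:
        return f"Point2DSketch({self.x}, {self.y})"
```

With such an interface, the centroid of two points ``a`` and ``b`` is simply ``(a + b) / 2``, and a 3-dimensional class only has to provide the same three operations.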

Running Example
^^^^^^^^^^^^^^^

.. argparse::
   :module: pyske.examples.list.util
   :func: k_means_parser
   :prog: python3 k_means_main.py


Maximum Prefix Sum
------------------

Maximum Segment Sum
-------------------

Parallel Regular Sampling Sort
------------------------------

Variance Example
----------------

10 changes: 6 additions & 4 deletions docs/conf.py
@@ -10,9 +10,9 @@
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
import os
import sys
sys.path.insert(0, os.path.abspath('../.'))


# -- Project information -----------------------------------------------------
@@ -31,6 +31,8 @@
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx.ext.autodoc",
    "sphinxarg.ext"
]

# Add any paths that contain templates here, relative to this directory.
@@ -52,4 +54,4 @@
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
html_static_path = ['_static']
Empty file added pyske/core/array/__init__.py
Empty file.
211 changes: 211 additions & 0 deletions pyske/core/array/array_interface.py
@@ -0,0 +1,211 @@
"""
Interface for PySke array.

Interfaces: Array2D.
"""

from abc import ABC, abstractmethod
from enum import Enum
from typing import Callable, Generic, TypeVar, Optional

# pylint: disable=unused-import
from pyske.core.interface import List
from pyske.core.support import parallel as parimpl

T = TypeVar('T') # pylint: disable=invalid-name
U = TypeVar('U') # pylint: disable=invalid-name
V = TypeVar('V') # pylint: disable=invalid-name

_PID: int = parimpl.PID
_NPROCS: int = parimpl.NPROCS
_COMM = parimpl.COMM

class Distribution(Enum):
    LINE = 'LINE'
    COLUMN = 'COLUMN'

class Array2D(ABC, Generic[T]):
    """
    PySke array2d (interface)

    Static methods:
        init.

    Methods:
        map, map2, reduce, distribute,
        get_partition, to_seq.
    """

    @abstractmethod
    def __init__(self: 'Array2D[T]') -> None:
        """
        Return an empty array.
        """

    @staticmethod
    @abstractmethod
    def init(value_at: Callable[[int, int], V], distribution: Distribution, col_size: int,
             line_size: int) -> 'Array2D[V]':
        """
        Return an array built using a function.

        Example::

            >>> from pyske.core.array.sarray2d import SArray2D
            >>> from pyske.core.array.array_interface import Distribution
            >>> number_line = 2
            >>> number_column = 2
            >>> init_function = lambda line, column: line * number_column + column
            >>> SArray2D.init(init_function, Distribution.LINE, number_column, number_line)
            ( 0 1 )
            ( 2 3 )

        :param value_at: binary function
        :param distribution: the distribution direction (LINE, COLUMN)
        :param col_size: number of columns
        :param line_size: number of lines
        :return: a 2D array of the given line and column sizes where, for every valid
            line index i and column index j, the value at (i, j) is value_at(i, j)
        """

    @abstractmethod
    def distribute(self: 'Array2D[T]', distribution_direction: Distribution) -> 'Array2D[T]':
        """
        Copy the array while changing its distribution.

        In the sequential version, it just returns ``self``. In the parallel
        version, communications are performed to obtain the requested line or
        column distribution.

        Examples::

            >>> from pyske.core.array.sarray2d import SArray2D
            >>> from pyske.core.array.array_interface import Distribution
            >>> sarray2d = SArray2D.init(lambda i, j: 1, Distribution.LINE, col_size=2, line_size=2)
            >>> sarray2d.distribute(Distribution.COLUMN)
            ( 1 1 )
            ( 1 1 )

        :param distribution_direction: the distribution direction (LINE, COLUMN)
        :return: an array containing the same elements.
        """

    @abstractmethod
    def map(self: 'Array2D[T]', unary_op: Callable[[T], V]) -> 'Array2D[V]':
        """
        Apply a function to all the elements.

        The returned array has the same shape (same size, same distribution)
        as the initial array.

        Examples::

            >>> from pyske.core.array.sarray2d import SArray2D
            >>> from pyske.core.array.parray2d import PArray2D
            >>> from pyske.core.array.array_interface import Distribution
            >>> col_size = 2
            >>> line_size = 2
            >>> SArray2D.init(lambda i, j: 1, Distribution.LINE, col_size, line_size).map(lambda x: x + 1)
            ( 2 2 )
            ( 2 2 )
            >>> parray2d = PArray2D.init(lambda i, j: 1, Distribution.LINE, col_size=2, line_size=2).map(lambda x: x + 1)
            >>> parray2d.to_seq()
            ( 2 2 )
            ( 2 2 )

        :param unary_op: function to apply to elements
        :return: a new array
        """

    @abstractmethod
    def reduce(self: 'Array2D[T]', binary_op: Callable[[T, T], T],
               neutral: Optional[T] = None) -> T:
        """
        Reduce an array of values to a single value.

        Examples::

            >>> from pyske.core.array.sarray2d import SArray2D
            >>> from pyske.core.array.parray2d import PArray2D
            >>> from pyske.core.array.array_interface import Distribution
            >>> parray2d = PArray2D.init(lambda i, j: 1, Distribution.COLUMN, col_size=2, line_size=2)
            >>> parray2d.reduce(lambda x, y: x + y)
            4
            >>> SArray2D().reduce(lambda x, y: x + y, 0)
            0

        :param binary_op: a binary associative and commutative operation
        :param neutral: (optional):
            a value that should be a neutral element for the operation,
            i.e. for every element e,
            ``binary_op(neutral, e) == binary_op(e, neutral) == e``.
            If this argument is omitted, the array should not be empty.
        :return: a value
        """

    @abstractmethod
    def get_partition(self: 'Array2D[T]') -> 'List[Array2D[T]]':
        """
        Make the distribution visible.

        Examples::

            >>> from pyske.core.array.sarray2d import SArray2D
            >>> from pyske.core.array.parray2d import PArray2D
            >>> from pyske.core.array.array_interface import Distribution
            >>> from pyske.core.util import par
            >>> col_size = 2
            >>> line_size = 2
            >>> init_function = lambda line, column: line * col_size + column
            >>> SArray2D.init(init_function, Distribution.LINE, col_size, line_size).get_partition()
            [( 0 1 )
            ( 2 3 )]
            >>> parray2d = PArray2D.init(init_function, Distribution.LINE, col_size=2, line_size=2)
            >>> parray2d.get_partition().to_seq() if par.procs() == [0, 1] else '[( 0 1 ), ( 2 3 )]'
            '[( 0 1 ), ( 2 3 )]'

        :return: a list of arrays.
        """

    @abstractmethod
    def map2(self: 'Array2D[T]', binary_op: Callable[[T, U], V],
             a_array: 'Array2D[U]') -> 'Array2D[V]':
        """
        Apply a function to all the elements of ``self`` and an array.

        The returned array has the same shape (same size, same distribution)
        as the initial arrays.

        Examples::

            >>> from pyske.core.array.sarray2d import SArray2D
            >>> from pyske.core.array.array_interface import Distribution
            >>> sarray2d = SArray2D.init(lambda line, column: 1, Distribution.LINE, col_size = 2, line_size = 2)
            >>> sarray2d.map2(lambda x, y: x + y, sarray2d)
            ( 2 2 )
            ( 2 2 )

        :param binary_op: function to apply to each pair of elements
        :param a_array: the second array.
            The second array must have the same column and line sizes as ``self``.
        :return: a new array.
        """

    @abstractmethod
    def to_seq(self: 'Array2D[T]') -> 'Array2D[T]':
        """
        Return a sequential array with same content.

        Examples::

            >>> from pyske.core.array.sarray2d import SArray2D
            >>> from pyske.core.array.parray2d import PArray2D
            >>> from pyske.core.array.array_interface import Distribution
            >>> PArray2D.init(lambda i, j: 1, Distribution.COLUMN, col_size=2, line_size=2).to_seq()
            ( 1 1 )
            ( 1 1 )
            >>> SArray2D.init(lambda line, column: 1, Distribution.LINE, col_size = 2, line_size = 2).to_seq()
            ( 1 1 )
            ( 1 1 )

        :return: a sequential array.
        """
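
To make the contract of the ``Array2D`` interface more concrete, here is a small sequential sketch of ``init``, ``map``, ``map2`` and ``reduce`` backed by a list of rows. It is not the ``SArray2D`` class from this pull request: the class name ``SeqArray2DSketch``, the ``rows`` attribute, and the omission of the ``distribution`` parameter are assumptions made for illustration.

```python
from functools import reduce as functools_reduce
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar('T')
U = TypeVar('U')
V = TypeVar('V')


class SeqArray2DSketch(Generic[T]):
    """Hypothetical sequential 2D array stored as a list of rows."""

    def __init__(self, rows: Optional[List[List[T]]] = None) -> None:
        # With no argument, the array is empty.
        self.rows: List[List[T]] = rows if rows is not None else []

    @staticmethod
    def init(value_at: Callable[[int, int], V], col_size: int,
             line_size: int) -> 'SeqArray2DSketch[V]':
        # The element at line i, column j is value_at(i, j).
        return SeqArray2DSketch([[value_at(i, j) for j in range(col_size)]
                                 for i in range(line_size)])

    def map(self, unary_op: Callable[[T], V]) -> 'SeqArray2DSketch[V]':
        return SeqArray2DSketch([[unary_op(x) for x in row]
                                 for row in self.rows])

    def map2(self, binary_op: Callable[[T, U], V],
             other: 'SeqArray2DSketch[U]') -> 'SeqArray2DSketch[V]':
        # Both arrays are assumed to have the same line and column sizes.
        return SeqArray2DSketch([[binary_op(x, y) for x, y in zip(r1, r2)]
                                 for r1, r2 in zip(self.rows, other.rows)])

    def reduce(self, binary_op: Callable[[T, T], T],
               neutral: Optional[T] = None) -> T:
        values = [x for row in self.rows for x in row]
        if neutral is None:
            # Without a neutral element the array must not be empty.
            return functools_reduce(binary_op, values)
        return functools_reduce(binary_op, values, neutral)

    def __repr__(self) -> str:
        return "\n".join("( " + " ".join(str(x) for x in row) + " )"
                         for row in self.rows)
```

A parallel version would keep only a block of lines or columns on each process, apply ``map`` and ``map2`` locally, and combine the local ``reduce`` results across processes, which is why the operation passed to ``reduce`` is required to be associative and commutative.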