Skip to content
This repository has been archived by the owner on Jun 15, 2023. It is now read-only.
/ MHA-FS Public archive

The biggest module developed with complete focus on Feature Selection (FS) using Meta-Heuristic Algorithm / Nature-inspired evolutionary / Swarm-based computing.

License

Notifications You must be signed in to change notification settings

thieu1995/MHA-FS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Notification

The biggest open-source Python library for feature selection has been successfully completed and is now available on GitHub at https://github.com/thieu1995/mafese.

MAFESE is an improved and completed version of this module. It not only includes all meta-heuristic algorithms, but it also categorizes them into different feature selection methods. These methods include traditional techniques such as filter-based, wrapper-based, unsupervised-based, and embedded-based methods. Among them, metaheuristic-based feature selection belongs to the filter-based method.

Feature Selection using Meta-Heuristic Algorithms (MEALPY)

Dependencies

  1. Mealpy (https://github.com/thieu1995/mealpy)
  2. Permetrics (https://github.com/thieu1995/permetrics)
  3. Scikit-learn (https://scikit-learn.org/stable/index.html)
  4. Pandas (https://pandas.pydata.org/)
  5. Matplotlib (https://matplotlib.org/)

Setup environments

Pip

pip install numpy, pandas, scikit-learn, matplotlib
pip install mealpy==2.5.0
pip install permetrics==1.3.0

Conda

conda create -n ml python==3.7.5
conda activate ml
conda install -c conda-forge numpy, pandas, scikit-learn, matplotlib
pip install mealpy==2.5.0
pip install permetrics==1.3.0

Examples

Run the file: src/models/mha_fs.py

Notes:


1. The format of solution / position 

- Because we need to select the best features in dataset right?

So a solution is a 1-D vector, each dimension represent an index of column in dataset. 
      If it has value 1, meaning this column is selected for the model
      If it has value 0, meaning this column is not selected for the model

2. The purposes of function: amend_position 

- Because the algorithm create a real-value solution, right?

So we need this function to convert real-value back into integer value (0 and 1). 
      That is why the lower bound is 0 for all dimensions (floor of 0 is 0)
      The upper bound is 1.99 for all dimensions (floor of 1.99 is 1)

Also if all the dimensions have value 0, meaning no column is selected --> Can't run the model 
      So we need to select a random column for the model

3. Fitness function 

- Use the Evaluator object to build a classification model with the selected model (KNN, or Random forrest or SVM) 
and calculate the metrics such as accuracy, precision, recall and f1. 

- Set the obj_weights to 1 if you interested in selected metric 

- In the current example, accuracy metric is selected, so the weight for accuracy is 1 and other metrics are 0. 
Beside, this problem will be maximization problem (we want the maximum accuracy - 1)

4. Use different algorithm

- Replace the selected model in the section: ## 2. Define algorithm and trial

5. Setting

- Everything needs to config is located at: src/config.py 

NoPrintAll

PrintAll

Call all models from MEALPY

https://mealpy.readthedocs.io/en/latest/pages/general/guide_to_use_model.html#import-all-models


from mealpy.bio_based import BBO, EOA, IWO, SBO, SMA, TPO, VCS, WHO
from mealpy.evolutionary_based import CRO, DE, EP, ES, FPA, GA, MA
from mealpy.human_based import BRO, BSO, CA, CHIO, FBIO, GSKA, ICA, LCO, QSA, SARO, SSDO, TLO
from mealpy.math_based import AOA, CGO, GBO, HC, SCA
from mealpy.music_based import HS
from mealpy.physics_based import ArchOA, ASO, EFO, EO, HGSO, MVO, NRO, SA, TWO, WDO
from mealpy.probabilistic_based import CEM
from mealpy.system_based import AEO, GCO, WCA
from mealpy.swarm_based import ABC, ACOR, ALO, AO, BA, BeesA, BES, BFO, BSA, COA, CSA, CSO, DO, EHO, FA, FFA, FOA, GOA, GWO, HGS
from mealpy.swarm_based import HHO, JA, MFO, MRFO, MSA, NMRA, PFA, PSO, SFO, SHO, SLO, SRSR, SSA, SSO, SSpiderA, SSpiderO, WOA


model = BBO.OriginalBBO(problem, epoch=10, pop_size=50)
# model = BBO.BaseBBO(problem, epoch=10, pop_size=50)
# model = EOA.BaseEOA(problem, epoch=10, pop_size=50)
# model = IWO.OriginalIWO(problem, epoch=10, pop_size=50)
# model = SBO.BaseSBO(problem, epoch=10, pop_size=50)
# model = SBO.OriginalSBO(problem, epoch=10, pop_size=50)
# model = SMA.OriginalSMA(problem, epoch=10, pop_size=50)
# model = SMA.BaseSMA(problem, epoch=10, pop_size=50)
# model = VCS.OriginalVCS(problem, epoch=100, pop_size=50)
# model = VCS.BaseVCS(problem, epoch=10, pop_size=50)
# model = WHO.BaseWHO(problem, epoch=10, pop_size=50)

# model = CRO.BaseCRO(problem, epoch=10, pop_size=50)
# model = CRO.OCRO(problem, epoch=10, pop_size=50)
# model = DE.BaseDE(problem, epoch=10, pop_size=50)
# model = DE.JADE(problem, epoch=10, pop_size=50)
# model = DE.SADE(problem, epoch=10, pop_size=50)
# model = DE.SHADE(problem, epoch=10, pop_size=50)
# model = DE.L_SHADE(problem, epoch=10, pop_size=50)
# model = DE.SAP_DE(problem, epoch=10, pop_size=50)
# model = EP.BaseEP(problem, epoch=10, pop_size=50)
# model = EP.LevyEP(problem, epoch=10, pop_size=50)
# model = ES.BaseES(problem, epoch=10, pop_size=50)
# model = ES.LevyES(problem, epoch=10, pop_size=50)
# model = FPA.BaseFPA(problem, epoch=10, pop_size=50)
# model = GA.BaseGA(problem, epoch=10, pop_size=50)
# model = MA.BaseMA(problem, epoch=10, pop_size=50)

# model = BRO.OriginalBRO(problem, epoch=10, pop_size=50)
# model = BRO.BaseBRO(problem, epoch=10, pop_size=50, pr=0.03)
# model = BSO.BaseBSO(problem, epoch=10, pop_size=50)
# model = BSO.ImprovedBSO(problem, epoch=10, pop_size=50)
# model = CA.OriginalCA(problem, epoch=10, pop_size=50)
# model = CHIO.BaseCHIO(problem, epoch=10, pop_size=50)
# model = CHIO.OriginalCHIO(problem, epoch=10, pop_size=50, pr=0.03)
# model = GSKA.BaseGSKA(problem, epoch=10, pop_size=50)
# model = GSKA.OriginalGSKA(problem, epoch=10, pop_size=50)
# model = FBIO.OriginalFBIO(problem, epoch=10, pop_size=50)
# model = FBIO.BaseFBIO(problem, epoch=10, pop_size=50)
# model = ICA.BaseICA(problem, epoch=10, pop_size=50)
# model = LCO.BaseLCO(problem, epoch=10, pop_size=50)
# model = LCO.ImprovedLCO(problem, epoch=10, pop_size=50, pr=0.03)
# model = LCO.OriginalLCO(problem, epoch=10, pop_size=50)
# model = QSA.BaseQSA(problem, epoch=10, pop_size=50)
# model = QSA.OppoQSA(problem, epoch=10, pop_size=50)
# model = QSA.LevyQSA(problem, epoch=10, pop_size=50)
# model = QSA.ImprovedQSA(problem, epoch=10, pop_size=50, pr=0.03)
# model = QSA.OriginalQSA(problem, epoch=10, pop_size=50)
# model = SARO.BaseSARO(problem, epoch=10, pop_size=50)
# model = SARO.OriginalSARO(problem, epoch=10, pop_size=50)
# model = SSDO.BaseSSDO(problem, epoch=10, pop_size=50)
# model = TLO.BaseTLO(problem, epoch=10, pop_size=50)
# model = TLO.OriginalTLO(problem, epoch=10, pop_size=50)
# model = TLO.ITLO(problem, epoch=10, pop_size=50)

# model = AOA.OriginalAOA(problem, epoch=10, pop_size=50)
# model = CGO.OriginalCGO(problem, epoch=100, pop_size=50)
# model = GBO.OriginalGBO(problem, epoch=10, pop_size=50, pr=0.03)
# model = HC.OriginalHC(problem, epoch=10, pop_size=50)
# model = HC.BaseHC(problem, epoch=10, pop_size=50)
# model = SCA.OriginalSCA(problem, epoch=10, pop_size=50)
# model = SCA.BaseSCA(problem, epoch=10, pop_size=50)

# model = HS.OriginalHS(problem, epoch=10, pop_size=50)
# model = HS.BaseHS(problem, epoch=10, pop_size=50)

# model = ArchOA.OriginalArchOA(problem, epoch=10, pop_size=50)
# model = ASO.BaseASO(problem, epoch=10, pop_size=50)
# model = EFO.OriginalEFO(problem, epoch=10, pop_size=50)
# model = EFO.BaseEFO(problem, epoch=10, pop_size=50)
# model = EO.BaseEO(problem, epoch=10, pop_size=50)
# model = EO.ModifiedEO(problem, epoch=10, pop_size=50)
# model = EO.AdaptiveEO(problem, epoch=10, pop_size=50)
# model = HGSO.BaseHGSO(problem, epoch=10, pop_size=50)
# model = MVO.OriginalMVO(problem, epoch=10, pop_size=50)
# model = MVO.BaseMVO(problem, epoch=10, pop_size=50)
# model = NRO.BaseNRO(problem, epoch=10, pop_size=50)
# model = SA.BaseSA(problem, epoch=10, pop_size=50)
# model = TWO.BaseTWO(problem, epoch=10, pop_size=50)
# model = TWO.OppoTWO(problem, epoch=10, pop_size=50)
# model = TWO.LevyTWO(problem, epoch=10, pop_size=50)
# model = TWO.EnhancedTWO(problem, epoch=10, pop_size=50)
# model = WDO.BaseWDO(problem, epoch=10, pop_size=50)

# model = CEM.BaseCEM(problem, epoch=10, pop_size=50)

# model = AEO.OriginalAEO(problem, epoch=10, pop_size=50)
# model = AEO.ModifiedAEO(problem, epoch=10, pop_size=50)
# model = AEO.AdaptiveAEO(problem, epoch=10, pop_size=50)
# model = AEO.EnhancedAEO(problem, epoch=10, pop_size=50)
# model = AEO.IAEO(problem, epoch=10, pop_size=50)
# model = GCO.OriginalGCO(problem, epoch=10, pop_size=50)
# model = GCO.BaseGCO(problem, epoch=10, pop_size=50)
# model = WCA.BaseWCA(problem, epoch=10, pop_size=50)

# model = ABC.BaseABC(problem, epoch=10, pop_size=50)
# model = ACOR.BaseACOR(problem, epoch=10, pop_size=50)
# model = ALO.OriginalALO(problem, epoch=10, pop_size=50)
# model = ALO.BaseALO(problem, epoch=10, pop_size=50)
# model = AO.OriginalAO(problem, epoch=10, pop_size=50)
# model = BA.BaseBA(problem, epoch=10, pop_size=50)
# model = BA.OriginalBA(problem, epoch=10, pop_size=50)
# model = BA.ModifiedBA(problem, epoch=10, pop_size=50)
# model = BeesA.BaseBeesA(problem, epoch=10, pop_size=50)
# model = BeesA.ProbBeesA(problem, epoch=10, pop_size=50)
# model = BES.BaseBES(problem, epoch=10, pop_size=50)
# model = BFO.OriginalBFO(problem, epoch=10, pop_size=50)
# model = BFO.ABFO(problem, epoch=10, pop_size=50)
# model = BSA.BaseBSA(problem, epoch=10, pop_size=50)
# model = COA.BaseCOA(problem, epoch=10, pop_size=50)
# model = CSA.BaseCSA(problem, epoch=10, pop_size=50)
# model = CSO.BaseCSO(problem, epoch=10, pop_size=50)
# model = DO.BaseDO(problem, epoch=10, pop_size=50)
# model = EHO.BaseEHO(problem, epoch=10, pop_size=50)
# model = FA.BaseFA(problem, epoch=10, pop_size=50)
# model = FFA.BaseFFA(problem, epoch=10, pop_size=50)
# model = FOA.OriginalFOA(problem, epoch=10, pop_size=50)
# model = FOA.BaseFOA(problem, epoch=10, pop_size=50)
# model = FOA.WhaleFOA(problem, epoch=10, pop_size=50)
# model = GOA.BaseGOA(problem, epoch=10, pop_size=50)
# model = GWO.BaseGWO(problem, epoch=10, pop_size=50)
# model = GWO.RW_GWO(problem, epoch=10, pop_size=50)
# model = HGS.OriginalHGS(problem, epoch=10, pop_size=50)
# model = HHO.BaseHHO(problem, epoch=10, pop_size=50)
# model = JA.OriginalJA(problem, epoch=10, pop_size=50)
# model = JA.BaseJA(problem, epoch=10, pop_size=50)
# model = JA.LevyJA(problem, epoch=10, pop_size=50)
# model = MFO.OriginalMFO(problem, epoch=10, pop_size=50)
# model = MFO.BaseMFO(problem, epoch=10, pop_size=50)
# model = MRFO.BaseMRFO(problem, epoch=10, pop_size=50)
# model = MSA.BaseMSA(problem, epoch=10, pop_size=50)
# model = NMRA.BaseNMRA(problem, epoch=10, pop_size=50)
# model = NMRA.ImprovedNMRA(problem, epoch=10, pop_size=50)
# model = PFA.BasePFA(problem, epoch=10, pop_size=50)
# model = PSO.BasePSO(problem, epoch=10, pop_size=50)
# model = PSO.C_PSO(problem, epoch=10, pop_size=50)
# model = PSO.CL_PSO(problem, epoch=10, pop_size=50)
# model = PSO.PPSO(problem, epoch=10, pop_size=50)
# model = PSO.HPSO_TVAC(problem, epoch=10, pop_size=50)
# model = SFO.BaseSFO(problem, epoch=10, pop_size=50)
# model = SFO.ImprovedSFO(problem, epoch=10, pop_size=50)
# model = SHO.BaseSHO(problem, epoch=10, pop_size=50)
# model = SLO.BaseSLO(problem, epoch=10, pop_size=50)
# model = SLO.ModifiedSLO(problem, epoch=10, pop_size=50)
# model = SLO.ISLO(problem, epoch=10, pop_size=50)
# model = SRSR.BaseSRSR(problem, epoch=10, pop_size=50)
# model = SSA.OriginalSSA(problem, epoch=10, pop_size=50)
# model = SSA.BaseSSA(problem, epoch=10, pop_size=50)
# model = SSO.BaseSSO(problem, epoch=10, pop_size=50)
# model = SSpiderA.BaseSSpiderA(problem, epoch=10, pop_size=50)
# model = SSpiderO.BaseSSpiderO(problem, epoch=10, pop_size=50)
# model = WOA.BaseWOA(problem, epoch=10, pop_size=50)
# model = WOA.HI_WOA(problem, epoch=10, pop_size=50)

Citations

  • If this code is useful for you, please give me some credits, link to my first-author papers.
@software{thieu_nguyen_2020_3711949,
  author       = {Nguyen Van Thieu},
  title        = {A collection of the state-of-the-art MEta-heuristics ALgorithms in PYthon: Mealpy},
  month        = march,
  year         = 2020,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.3711948},
  url          = {https://doi.org/10.5281/zenodo.3711948}
}