WebUI image visualization, error handle improved, result image amount unlimited, bug fix
Mr-SGXXX · May 29, 2024 · 8670251
1 parent c8cbe4d
commit 8670251

Showing 11 changed files with 97 additions and 31 deletions.
diff --git a/.gitignore b/.gitignore
@@ -3,4 +3,5 @@ build
 pyerm.egg-info
 __pycache__
 *.DS_Store
-.vscode
+.vscode
+*.baiduyun.*
diff --git a/README.md b/README.md
@@ -1,14 +1,14 @@
 # PyERM (Python Experiment Record Manager)
-This project is an experiment record manager for python based on SQLite DMS, which can help you efficiently save your experiment settings and results for later analysis. 
+This project is a general experiment record manager for python based on SQLite DMS, which can help you efficiently save your experiment settings and results for later analysis. 
 
 *In the current version, all operations will be performed locally.*
 
 # Introduction
-This project is used to save the settings and results of any experiment consists of three parts: method, data, task. 
+This project is used to save the settings and results of any experiment which consists of three parts: method, data, task. 
 
-Besides, the basic information and detail information of the experiment will also be recorded.
+Besides, the basic information and detail information of the experiment can also be recorded.
 
-All data you want can be efficiently saved by API provided without knowing the detail implement, but I suggest reading the table introduction for further dealing with the records. 
+All data you want can be efficiently saved by API provided without knowing the project detail implement, but I suggest reading the table introduction for further dealing with the records. 
 
 ## Install Introduction
 All you need to do for using the python package is using the following command:
@@ -19,7 +19,7 @@ All you need to do for using the python package is using the following command:
 ### Table Define & Init
 Before starting the experiment, you need to init the tables you need for the experiment by three init function: `data_init()`, `method_init()`, `task_init()`.
 
- You need to input the name and experiment parameter for the first two. The function can automatically detect the data type, and they will create the table if not exist. If you want to define the DMS type yourself, you can input a `param_def_dict` to these function, whose key means column name, and value means column SQL type define, like `{"people", "TEXT DEFAULT NULL"}`. 
+ You need to input the name and experiment parameter for the first two. The function can automatically detect the data type from input dict, like `{"name": "Alice", "age": 20}`, and they will create the table if not exist. If you want to define the DMS type yourself, you can input a `param_def_dict` to these function, whose key means column name, and value means column SQL type define, like `{"name": "TEXT DEFAULT NULL", "age": "INTEGER DEFAULT 20"}`. 
 
 ### Experiment 
 
@@ -33,19 +33,28 @@ The experiment recorder mainly consists of four parts, `experiment_start()`, `ex
 
 `detail_update()` saves the intermediate results. It's optional, and if you never use it and don't manually set the define dict, the detail table may not be created.
 
+you can see a specific example in the [github repositories of this project](https://github.com/Mr-SGXXX/pyerm/tree/master/examples) 
+
 
 ## Scripts Introduction
-### export_xls 
-Export the content of a SQLite database to an Excel file
+### export_zip 
+Export the content of a SQLite database to an Excel file and the result images (if exists) in a zip
 ```shell
-export_xls db_path(default ~/experiment.db) output_path(default ./experiment_record.xls)
+export_zip db_path(default ~/experiment.db) output_dir(default ./)
 ```
 ### db_merge 
-Merge two SQLite databases.
+Merge two SQLite databases. The two database must have the same structure for current version.
 ```shell
 db_merge db_path_destination db_path_source
 ```
 
+### pyerm_webui
+Open the WebUI of pyerm, and other devices in the network can also access it for remote check. 
+In the WebUI, you can see all the table of the database including the images of result table or use SQL to get what you want to see. 
+Besides, the WebUI also offers a way to download the zip the same as `export_zip` or the raw db file. 
+```shell
+pyerm_webui
+```
 
 ## Table Introduction
 
@@ -65,21 +74,22 @@ The only necessary column for method table is the data setting id, which will be
 ### Result Table
 Each Result Table is identified by its corresponding task name, and different tasks will be assigned with different tables for saving its different experiment results, such as accuracy for classification, normalized mutual information for clustering. 
 
-Besides, this table offers several columns for saving image in order for latter visualization. 
+Besides, this table can save the result images without amount limit in the code. 
 
 The only necessary column for result table is the experiment id, other specific column is set by users.
 
 ### Detail Table
-Each Detail Table is identified by its corresponding method name, different methods are related to different detail table. During an experiment, you may need to record some intermediate results, which can be saved in this table.
+Each Detail Table is identified by its corresponding method name, different methods are related to different detail table. During an experiment, you may need to record some intermediate results, such as epoch&loss for deep learning, which can be saved in this table.
 
 The only necessary column for detail table is the detail id (which can be set automatically) and the experiment id, other specific column is set by users.
 
 
 # Future Plan
 
-- [ ] Some Scripts For Better Usage  
+- [x] Web UI Visualization 
 - [ ] Experiment Summary Report Generate
-- [ ] Web UI Visualize & Commonly Used Analyze Fuctions
+- [ ] Commonly Used Analyze Fuctions
+- [ ] Bug fix & performence improving
 
 # Contact
 My email is [email protected]. If you have any question or advice, please contact me. 
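
Note on the README's "Table Define & Init" section: it says the init functions detect column types from an input dict. The following is only a rough sketch of what such detection might look like. `sketch_detect_def` and its type map are hypothetical and are not taken from the project's `auto_detect_def`; the `TypeError` for unsupported values mirrors the error visible in `experiment.py`.

```python
def sketch_detect_def(param_dict):
    """Map Python values to SQLite column definitions (illustrative only)."""
    # bool is checked before int because bool is a subclass of int in Python.
    type_map = [
        (bool, "INTEGER"),
        (int, "INTEGER"),
        (float, "REAL"),
        (str, "TEXT"),
        (bytes, "BLOB"),
    ]
    param_def_dict = {}
    for key, value in param_dict.items():
        for py_type, sql_type in type_map:
            if isinstance(value, py_type):
                param_def_dict[key] = f"{sql_type} DEFAULT NULL"
                break
        else:
            # No known mapping for this value: refuse instead of guessing.
            raise TypeError(f"Unsupported type for DB: {type(value)}")
    return param_def_dict


detected = sketch_detect_def({"name": "Alice", "age": 20})
```

A hand-written `param_def_dict` such as `{"name": "TEXT DEFAULT NULL", "age": "INTEGER DEFAULT 20"}` would take the place of the detected dict when different defaults are wanted.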
diff --git a/examples/example.ipynb b/examples/example.ipynb
@@ -219,7 +219,25 @@
    "cell_type": "code",
    "execution_count": 7,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[0m\n",
+      "\u001b[34m\u001b[1m  You can now view your Streamlit app in your browser.\u001b[0m\n",
+      "\u001b[0m\n",
+      "\u001b[34m  Local URL: \u001b[0m\u001b[1mhttp://localhost:8503\u001b[0m\n",
+      "\u001b[34m  Network URL: \u001b[0m\u001b[1mhttp://172.20.183.170:8503\u001b[0m\n",
+      "\u001b[0m\n",
+      "\u001b[34m\u001b[1m  For better performance, install the Watchdog module:\u001b[0m\n",
+      "\n",
+      "  $ xcode-select --install\n",
+      "  $ pip install watchdog\n",
+      "            \u001b[0m\n"
+     ]
+    }
+   ],
    "source": [
     "!pyerm_webui"
    ]

diff --git a/examples/example.py b/examples/example.py
@@ -81,7 +81,7 @@ def demo_experiment(exp: pyerm.Experiment):
     ax.set_title('KMeans Clustering of 2D Gaussian Data')
     ax.set_xlabel('X1')
     ax.set_ylabel('X2')
-
+    # raise RuntimeError('This is a demo error to test the error handling function.')
     buf = BytesIO()
     fig.savefig(buf, format='png')
     buf.seek(0)

diff --git a/pyerm/database/experiment.py b/pyerm/database/experiment.py
@@ -25,7 +25,8 @@
 import os
 import typing
 from PIL import Image
-import atexit
+import traceback
+import sys
 from copy import deepcopy
 
 from .dbbase import Database
@@ -115,6 +116,10 @@ def experiment_start(self, description:str=None, start_time:float=None, tags:typ
             The experiment ID
 
         """
+        def handle_exception(exc_type, exc_value, exc_traceback):
+            error_info = "".join(traceback.format_exception(exc_type, exc_value, exc_traceback))
+            self.experiment_table.experiment_failed(self._id, error_info)
+            sys.__excepthook__(exc_type, exc_value, exc_traceback)
         assert self._data is not None, 'Data not initialized, run data_init() first'
         assert self._method is not None, 'Method not initialized, run method_init() first'
         assert self._task is not None, 'Task not initialized, run task_init() first'
@@ -124,7 +129,7 @@ def experiment_start(self, description:str=None, start_time:float=None, tags:typ
             experimenters = ','.join(experimenters)
         self._id = self.experiment_table.experiment_start(description, self._method, self._method_id, self._data, self._data_id, self._task, start_time, tags, experimenters)
         self.run_times += 1
-        atexit.register(self.experiment_table.experiment_failed, self._id)
+        sys.excepthook = handle_exception
         return self._id
 
     def experiment_over(self, rst_dict:typing.Dict[str, typing.Any], image_dict:typing.Dict[str, typing.Union[Image.Image, str, bytearray, bytes]]={}, end_time:float=None, useful_time_cost:float=None) -> None:
@@ -158,7 +163,7 @@ def experiment_over(self, rst_dict:typing.Dict[str, typing.Any], image_dict:typi
         self.rst_table.record_image(self._id, **image_dict)
         self.experiment_table.experiment_over(self._id, end_time=end_time, useful_time_cost=useful_time_cost)
         self._id = None
-        atexit.unregister(self.experiment_table.experiment_failed)
+        sys.excepthook = sys.__excepthook__
 
 
     def experiment_failed(self, error_info:str, end_time:float=None) -> None:
@@ -316,3 +321,5 @@ def auto_detect_def(param_dict:typing.Dict[str, typing.Any]) -> typing.Dict[str,
             except:
                 raise TypeError(f'Unsupported type for DB: {type(v)}, consider to convert it to str or bytes.')
     return param_def_dict
+
+
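
Note on the `atexit` to `sys.excepthook` change above: the pattern can be sketched in isolation as follows. `FailureRecorder`, `install_hook` and `remove_hook` are hypothetical names used only for this illustration; `FailureRecorder` stands in for the experiment table and keeps failures in memory. The hook is called by hand here so that the example does not terminate the interpreter.

```python
import sys
import traceback


class FailureRecorder:
    """Hypothetical stand-in for the experiment table."""

    def __init__(self):
        self.failures = {}

    def experiment_failed(self, experiment_id, error_info):
        self.failures[experiment_id] = error_info


def install_hook(recorder, experiment_id):
    def handle_exception(exc_type, exc_value, exc_traceback):
        # Format the full traceback as one string and store it.
        error_info = "".join(
            traceback.format_exception(exc_type, exc_value, exc_traceback)
        )
        recorder.experiment_failed(experiment_id, error_info)
        # Delegate to the default hook so the traceback is still printed.
        sys.__excepthook__(exc_type, exc_value, exc_traceback)

    sys.excepthook = handle_exception
    return handle_exception


def remove_hook():
    # Restore the interpreter's default hook once the experiment is over.
    sys.excepthook = sys.__excepthook__


recorder = FailureRecorder()
hook = install_hook(recorder, experiment_id=1)
try:
    raise RuntimeError("demo failure")
except RuntimeError:
    # Simulate what the interpreter does for an uncaught exception.
    hook(*sys.exc_info())
remove_hook()
```

Two properties of this pattern are worth keeping in mind: `sys.excepthook` is process-wide, so only the most recently installed hook is active, and it is not invoked for `SystemExit` or for exceptions raised in other threads (those go through `threading.excepthook`).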
diff --git a/pyerm/database/tables.py b/pyerm/database/tables.py
@@ -28,6 +28,7 @@
 from time import strftime, time, localtime
 import typing
 import traceback
+import sys
 
 from .dbbase import Table, Database
 
@@ -71,8 +72,10 @@ def experiment_failed(self, experiment_id:int, error_info:str=None, end_time:flo
             end_time = time()
         end_time = localtime(end_time)
         end_time = strftime("%Y-%m-%d %H:%M:%S", end_time)
+        print(error_info)
         if error_info is None:
             error_info = traceback.format_exc()
+            print(error_info)
         super().update(f"id={experiment_id}", end_time=strftime(end_time), status='failed', failed_reason=error_info)
 
     def get_experiment(self, experiment_id:int) -> dict:
@@ -141,7 +144,7 @@ def image_def(i):
     return {f'image_{i}_name': 'TEXT DEFAULT NULL', f'image_{i}': 'BLOB DEFAULT NULL'}
 
 class ResultTable(Table):
-    def __init__(self, db: Database, task: str, rst_def_dict: dict=None, default_image_num: int=10) -> None:
+    def __init__(self, db: Database, task: str, rst_def_dict: dict=None, default_image_num: int=2) -> None:
         columns = {
             'experiment_id': 'INTEGER PRIMARY KEY AUTOINCREMENT',
             **rst_def_dict,
@@ -165,6 +168,7 @@ def record_image(self, experiment_id:int, **image_dict:typing.Dict[str, typing.U
             if i > self.max_image_num:
                 self.add_column(f'image_{i}_name', 'TEXT DEFAULT NULL')
                 self.add_column(f'image_{i}', 'BLOB DEFAULT NULL')
+                self.max_image_num += 1
             if isinstance(image_dict[image_key], Image.Image):
                 image = BytesIO()
                 image_dict[image_key].save(image, format='PNG')
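
Note on the "result image amount unlimited" change above: the idea of starting with a small number of image columns and adding more on demand can be sketched with plain `sqlite3`. This is not the project's `Table.add_column` implementation; `result_demo`, `ensure_image_columns` and the zero-based counting are assumptions made for the illustration. Only `image_def` follows the helper shown in the diff.

```python
import sqlite3


def image_def(i):
    # Same column pair as in the diff: a name column and a BLOB column.
    return {f"image_{i}_name": "TEXT DEFAULT NULL", f"image_{i}": "BLOB DEFAULT NULL"}


conn = sqlite3.connect(":memory:")
default_image_num = 2

# Create the table with the default number of image column pairs.
columns = {"experiment_id": "INTEGER PRIMARY KEY"}
for i in range(default_image_num):
    columns.update(image_def(i))
column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
conn.execute(f"CREATE TABLE result_demo ({column_sql})")


def ensure_image_columns(conn, needed, current):
    """Add image column pairs until `needed` pairs exist; return the new count."""
    for i in range(current, needed):
        for name, sql_type in image_def(i).items():
            conn.execute(f"ALTER TABLE result_demo ADD COLUMN {name} {sql_type}")
    return max(current, needed)


# An experiment that produces four images needs two more column pairs.
image_count = ensure_image_columns(conn, needed=4, current=default_image_num)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(result_demo)")]
```

Keeping the running count in sync with the columns that actually exist matters here, since SQLite rejects an `ALTER TABLE ... ADD COLUMN` for a column name that is already present.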

diff --git a/pyerm/scripts/db_merge.py b/pyerm/scripts/db_merge.py
@@ -48,7 +48,7 @@ def merge_db(db_path1:str, db_path2:str):
         copy_table(db1, db2, table_name)
 
 def main():
-    parser = argparse.ArgumentParser(description='Merge two SQLite databases.')
+    parser = argparse.ArgumentParser(description='Merge two SQLite databases. For now, the merged two databases must have the same schema.')
     parser.add_argument('db_path_destination', type=str, help='Destination database file path.')
     parser.add_argument('db_path_source', type=str, help='Source database file path.')
     args = parser.parse_args()
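
Note on the "same schema" requirement added to the description above: a pre-merge check could look roughly like the sketch below. `schema_of` and `same_schema` are hypothetical helpers that are not part of `db_merge.py`, and the `demo` table exists only for this example.

```python
import sqlite3


def schema_of(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for user tables."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type='table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        schema[table] = [(col[1], col[2].upper()) for col in cols]
    return schema


def same_schema(conn_a, conn_b):
    return schema_of(conn_a) == schema_of(conn_b)


destination = sqlite3.connect(":memory:")
source = sqlite3.connect(":memory:")
mismatched = sqlite3.connect(":memory:")
destination.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, status TEXT)")
source.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, status TEXT)")
mismatched.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, status TEXT, tags TEXT)")

can_merge = same_schema(destination, source)
cannot_merge = not same_schema(destination, mismatched)
```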

diff --git a/pyerm/scripts/export_data.py b/pyerm/scripts/export_data.py
@@ -28,6 +28,7 @@
 import sqlite3
 import argparse
 import os
+import shutil
 from zipfile import ZipFile
 
 USER_HOME = os.path.expanduser('~')
@@ -50,7 +51,7 @@ def export_data(db_path:str, output_dir:str):
         df = pd.read_sql_query(f"SELECT * FROM {table_name}", conn)
         if table_name.startswith("result_"):
             for col in df.columns:
-                if col.startswith("image_") and not col.endswith("_name"):
+                if col.startswith("image_") and not col.endswith("_name") and not df[f"{col}_name"].isnull().all():
                     img_paths = []
                     for i, row in df.iterrows():
                         img_data = row[col]
@@ -88,14 +89,14 @@ def zip_dir(dir_path:str, zip_path:str, remove_original=False):
 
                 print(file_path)
     if remove_original:
-        os.rmdir(dir_path)
+        shutil.rmtree(dir_path)
 
 
 
 def main():
     parser = argparse.ArgumentParser(description="Export the content of a SQLite database to an Excel file")
     parser.add_argument('db_path', type=str, nargs='?', default=None, help='The path of the database file')
-    parser.add_argument('output_dir', type=str, nargs='?', default="~/experiment_record", help='The dir path of the output file')
+    parser.add_argument('output_dir', type=str, nargs='?', default="./", help='The dir path of the output file')
     args = parser.parse_args()
     if args.db_path is None:
         args.db_path = os.path.join(USER_HOME, 'experiment.db')
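
Note on the `os.rmdir` to `shutil.rmtree` change above: `os.rmdir` only removes empty directories, so it fails once the export directory contains files. A small self-contained demonstration, using a temporary directory and a placeholder file that are assumptions of this example:

```python
import os
import shutil
import tempfile

# Build a throwaway export directory that contains a file.
base = tempfile.mkdtemp()
export_dir = os.path.join(base, "experiment_record")
os.makedirs(os.path.join(export_dir, "images"))
with open(os.path.join(export_dir, "images", "demo.png"), "wb") as f:
    f.write(b"placeholder bytes, not a real image")

# os.rmdir refuses to delete a non-empty directory.
try:
    os.rmdir(export_dir)
    rmdir_failed = False
except OSError:
    rmdir_failed = True

# shutil.rmtree removes the directory together with everything inside it.
shutil.rmtree(export_dir)
removed = not os.path.exists(export_dir)

# Clean up the temporary parent directory as well.
shutil.rmtree(base)
```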

diff --git a/pyerm/webUI/home.py b/pyerm/webUI/home.py
@@ -40,7 +40,7 @@ def home():
         st.markdown('Export Experiment Data')
         if st.checkbox('Download Excel & Result Images as ZIP'):
             download_zip()
-        elif st.checkbox('Download raw db file'):
+        if st.checkbox('Download raw db file'):
             download_db()
 
 
@@ -56,7 +56,7 @@ def title():
     st.markdown(f"**Disclaimer**: This is a demo version. The actual version is not available yet.")
 
 def load_db():
-    st.markdown('## Load Database')
+    st.markdown('## Load Database (PyERM only supports local SQLite database for now)')
     db_path = st.text_input("Database Path", value=st.session_state.db_path)
     if st.button('Change Database Path'):
         st.session_state.db_path = db_path
@@ -88,7 +88,7 @@ def download_zip():
     st.download_button(
             label="Download Excel&Images as ZIP",
             data=st.session_state.zip,
-            file_name=f"{os.path.basename(st.session_state.db_path)}.zip",
+            file_name=f"{os.path.basename(os.path.splitext(st.session_state.db_path)[0])}.zip",
             mime="application/zip"
         )
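
Note on the `file_name` change above: stripping the extension before taking the base name avoids download names such as `experiment.db.zip`. A minimal sketch, where `zip_name` is a hypothetical helper and the paths are made up for the example:

```python
import os


def zip_name(db_path):
    # Drop the last extension, then keep only the file name part.
    stem = os.path.splitext(db_path)[0]
    return f"{os.path.basename(stem)}.zip"


plain = zip_name("/home/user/experiment.db")
# splitext removes only the final extension, so inner dots are preserved.
dotted = zip_name("/home/user/run.v2.db")
```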
 

diff --git a/pyerm/webUI/tables.py b/pyerm/webUI/tables.py
@@ -23,8 +23,12 @@
 # Version: 0.2.4
 
 import pandas as pd
+from PIL import Image
+import base64
+from io import BytesIO
 import streamlit as st
 import os
+import re
 
 from pyerm.database.dbbase import Database
 
@@ -46,6 +50,15 @@ def detect_tables():
     st.session_state.table_name = table_name
 
 def select_tables():
+    def image_to_base64(img):
+        buffered = BytesIO(img)
+        img_str = base64.b64encode(buffered.getvalue()).decode()
+        return img_str
+
+    def make_image_clickable(image_name, image):
+        img_str = image_to_base64(image)
+        return f'<a href="data:image/jpeg;base64,{img_str}" target="_blank" title="{image_name}"><img src="data:image/jpeg;base64,{img_str}" width="100"></a>'
+
     db = Database(st.session_state.db_path, output_info=False)
     table_name = st.session_state.table_name
     if st.session_state.sql is not None:
@@ -62,16 +75,29 @@ def select_tables():
         columns = [column[0] for column in db.cursor.description]
         df = pd.DataFrame(data, columns=columns)
     columns_keep = [col for col in df.columns if not col.startswith("image_")]
+    pattern = re.compile(r'image_(\d+)')
+    max_image_num = -1
+    for name in df.columns:
+        match = pattern.match(name)
+        if match:
+            max_image_num = max(max_image_num, int(match.group(1)))
+    for i in range(max_image_num+1):
+        if f'image_{i}' in df.columns and not df[f'image_{i}_name'].isnull().all():
+            df[f'image_{i}'] = df.apply(lambda x: make_image_clickable(x[f'image_{i}_name'], x[f'image_{i}']), axis=1)
+            columns_keep.append(f'image_{i}')
     df = df[columns_keep]
-
     st.write('## Table:', table_name)
-    st.dataframe(df)
+    st.write(df.to_html(escape=False, columns=columns_keep), unsafe_allow_html=True)
+
+    # st.dataframe(df[columns_keep])
+
 
 
 def input_sql():
     st.sidebar.write('You can also set the columns and condition for construct a select SQL sentense for the current table here.')
     condition = st.sidebar.text_input("Condition", value='', help='The condition for the select SQL sentense.')
     columns = st.sidebar.text_input("Columns", value='*', help='The columns for the select SQL sentense.')
+    st.session_state.table_name = st.sidebar.text_input("Table", value=st.session_state.table_name, help='The table, view or query for the select SQL sentense.')
     if st.sidebar.button('Run'):
         st.session_state.sql = f"SELECT {columns} FROM {st.session_state.table_name} WHERE {condition}" if condition else f"SELECT {columns} FROM {st.session_state.table_name}"
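
Note on the image rendering added to `select_tables()` above: the two building blocks (finding the `image_<i>` columns and turning the stored bytes into an inline thumbnail link) can be tried without Streamlit or pandas. This sketch deviates from the diff in two ways, both of which are assumptions rather than the project's behavior: it guesses the MIME type from the leading bytes instead of always using `image/jpeg`, and it HTML-escapes the image name. `guess_mime` and `image_indices` are hypothetical helper names.

```python
import base64
import html
import re

# Matches "image_0" but not "image_0_name".
IMAGE_COLUMN = re.compile(r"image_(\d+)$")


def guess_mime(data):
    """Guess the MIME type from the leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    return "application/octet-stream"


def make_image_clickable(image_name, data, width=100):
    """Return an HTML thumbnail that opens the full image in a new tab."""
    encoded = base64.b64encode(data).decode("ascii")
    uri = f"data:{guess_mime(data)};base64,{encoded}"
    title = html.escape(image_name, quote=True)
    return (
        f'<a href="{uri}" target="_blank" title="{title}">'
        f'<img src="{uri}" width="{width}"></a>'
    )


def image_indices(column_names):
    """Collect the numeric suffixes of all image_<i> columns, sorted."""
    found = set()
    for name in column_names:
        match = IMAGE_COLUMN.match(name)
        if match:
            found.add(int(match.group(1)))
    return sorted(found)


columns = ["experiment_id", "image_0_name", "image_0", "image_1_name", "image_1", "accuracy"]
indices = image_indices(columns)

# Only the PNG signature plus padding: enough for the MIME check, not a valid image.
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
cell = make_image_clickable('cluster "plot"', fake_png)
```

Because the resulting HTML is rendered with `unsafe_allow_html=True`, escaping any user-controlled text such as image names is a reasonable precaution.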
 

diff --git a/setup.py b/setup.py
@@ -20,7 +20,7 @@
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.
 
-# Version: 0.2.3
+# Version: 0.2.4
 from setuptools import setup, find_packages
 
 with open("README.md", "r", encoding="utf-8") as f:
@@ -50,7 +50,6 @@
         ],
     },
     install_requires=[
-        "zipfile",
         "pandas",
         "pillow",
         "xlsxwriter",